Using GeoCombine to harvest and index OpenGeoMetadata
Feb 5, 2015
Sharing, collaborating on, and harvesting geospatial metadata is not easy. A recent development in this area is the OpenGeoMetadata project, which aims to be a shared repository for institutions looking to share, collaborate on, and harvest geospatial metadata. For more details on how the project is structured and why we think this is really cool, see this readme.
We started work on GeoCombine, software focused on harvesting, converting, and indexing this metadata.
GeoCombine - A ruby toolkit for geospatial metadata
Currently (as of 2015-02-05), GeoCombine really just does three things:
- Clones OpenGeoMetadata repositories
- Updates the local cloned repositories (using
- Indexes geoblacklight.xml files from the cloned repositories into Solr
This guide assumes a few things already.
- You have Git installed
- You have Ruby installed
- You have Solr running locally on port 8983 (the default Solr port), configured with the GeoBlacklight-Schema configuration
If you are confused about these prerequisites, it is probably best that you start by running through the workshop “A hands on introduction to GeoBlacklight”.
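If you would rather check the prerequisites above from a script, the following is a minimal sketch and not part of GeoCombine. The method names are hypothetical, and the Solr URL is an assumption based on the default port mentioned above. It only checks that something answers HTTP at that address, not that the GeoBlacklight-Schema configuration is in place.

```ruby
require 'net/http'
require 'uri'

# Returns true if an executable file with the given name is found on the PATH.
def command_available?(name)
  ENV.fetch('PATH', '').split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Returns true if an HTTP server answers at the given URL without a
# server-side (5xx) error. The default URL is an assumption: adjust the
# host, port, and path to match your own Solr setup.
def solr_reachable?(url = 'http://localhost:8983/solr/')
  response = Net::HTTP.get_response(URI(url))
  response.code.to_i < 500
rescue StandardError
  # Connection refused, DNS failure, timeout, and similar errors land here.
  false
end

# Collects the three checks into a Hash of name => true/false.
def prerequisite_report
  {
    'git' => command_available?('git'),
    'ruby' => command_available?('ruby'),
    'solr' => solr_reachable?
  }
end
```

Calling `prerequisite_report` returns a Hash you can print or inspect; any `false` value points at the prerequisite to sort out first.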
If you already have a GeoBlacklight application, skip steps 1 and 2. You can just add
gem 'geo_combine' to your GeoBlacklight application’s
To get started, first clone the GeoCombine repository
$ git clone https://github.com/OpenGeoMetadata/GeoCombine.git
Switch to its folder
$ cd GeoCombine
Install GeoCombine’s dependencies
$ bundle install
Create a tmp directory (if it doesn’t already exist)
$ mkdir tmp
Clone all of the ‘edu.*’ repositories to tmp.
$ rake geocombine:clone
Since other software projects live in OpenGeoMetadata, we only want to clone the metadata repositories. All of these are currently namespaced as “edu.institution.subdomain”.
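To make the namespace filter concrete, here is a small sketch of selecting only the “edu.” repositories from a list. It is an illustration of the idea, not necessarily how the rake task does it. The method name and sample data are hypothetical; the 'name' and 'clone_url' keys mirror fields GitHub's REST API returns when listing an organization's repositories.

```ruby
# Keeps only repositories whose name starts with the "edu." namespace.
# Each repository is a Hash with 'name' and 'clone_url' keys.
def metadata_repositories(repositories)
  repositories.select { |repo| repo['name'].start_with?('edu.') }
end

# Hypothetical sample data, for illustration only.
sample = [
  { 'name' => 'edu.example.library', 'clone_url' => 'https://example.org/edu.example.library.git' },
  { 'name' => 'GeoCombine', 'clone_url' => 'https://example.org/GeoCombine.git' }
]

# Prints the clone URL of each repository that passed the filter.
metadata_repositories(sample).each { |repo| puts repo['clone_url'] }
```

With the sample above, only the first entry survives the filter, so the software repository is skipped.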
Index all of the geoblacklight.xml documents located in the cloned repositories.
$ rake geocombine:index
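If you are curious which files the indexing step would pick up, the sketch below lists every geoblacklight.xml file beneath the tmp directory. It covers only the discovery part and is an assumption about the approach, not the rake task's actual code; the method name is hypothetical.

```ruby
# Returns the sorted paths of every file named geoblacklight.xml found
# at any depth beneath the given root directory.
def geoblacklight_documents(root = 'tmp')
  Dir.glob(File.join(root, '**', 'geoblacklight.xml')).sort
end

# Prints one path per line; prints nothing if no files are found.
geoblacklight_documents.each { |path| puts path }
```

Running this before indexing gives you a rough idea of how many documents to expect.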
Go grab a coffee or lunch, because this might take a while! Afterwards, your index should have more than 30,000 new records in it.
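One way to confirm the record count is to read the numFound value from a Solr select response in JSON format. The sketch below parses such a response body; the method name and the sample body are hypothetical, and the number in it is made up for illustration.

```ruby
require 'json'

# Extracts the total number of matching documents from the body of a
# Solr JSON select response.
def num_found(response_body)
  JSON.parse(response_body).fetch('response').fetch('numFound')
end

# Hypothetical response body, trimmed to the fields used here.
sample_body = '{"responseHeader":{"status":0},"response":{"numFound":30123,"start":0,"docs":[]}}'

puts num_found(sample_body)
```

To get a real response body, send a select query with q=*:*, rows=0, and wt=json to your local Solr instance; the exact URL path depends on how your Solr core is named.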