Using GeoCombine to harvest and index OpenGeoMetadata
A quick tutorial on how to harvest and index OpenGeoMetadata for your GeoBlacklight installation.
Warning
This tutorial from 2015 may be outdated. Please refer to the GeoCombine repo for up-to-date instructions.
Sharing, collaborating on, and harvesting geospatial metadata is not easy. A recent development in this area is the OpenGeoMetadata project, which aims to be a shared repository for institutions looking to share, collaborate on, and harvest geospatial metadata. For more details on how the project is structured and why we think this is really cool, see this readme.
We started work on GeoCombine, software focused on harvesting, converting, and indexing this metadata.
GeoCombine - A Ruby toolkit for geospatial metadata
GeoCombine is envisioned as an easy-to-use toolkit for metadata conversions, with integration into applications and projects like GeoBlacklight, GeoMonitor, and OpenGeoMetadata.
Currently (as of 2015-02-05), GeoCombine really just does three things:
- Clones OpenGeoMetadata repositories
- Updates the local cloned repositories (using git pull)
- Indexes geoblacklight.xml files from the cloned repositories into Solr
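Cloning and updating boil down to a clone-or-pull decision for each repository. As a rough sketch of that decision (this is not GeoCombine's actual code; the helper name `sync_command` and the example repository name are hypothetical), it could be expressed in Ruby like this:

```ruby
# Hypothetical helper, not part of GeoCombine: returns the git command
# that would bring a local copy of a repository up to date.
def sync_command(clone_url, local_dir)
  if File.directory?(File.join(local_dir, ".git"))
    # Already cloned: update the existing working copy in place.
    ["git", "-C", local_dir, "pull"]
  else
    # Not cloned yet: fetch a fresh copy into local_dir.
    ["git", "clone", clone_url, local_dir]
  end
end

# Build (but do not run) the command for one repository.
# "edu.example" is a made-up placeholder, not a real repository.
cmd = sync_command("https://github.com/OpenGeoMetadata/edu.example.git",
                   "tmp/edu.example")
puts cmd.join(" ")
```

The helper only builds the command, so it can be inspected safely; passing the array to `system(*cmd)` would actually run it.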
Getting started
This guide assumes a few things already.
- You have Git installed
- You have Ruby installed
- You have Solr running locally on port 8983 (the default Solr port), configured with the GeoBlacklight-Schema configuration
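If you want to check these assumptions before starting, a small script along the following lines may help. It is only a sketch: the helper names `find_executable` and `solr_listening?` are made up for this example, it assumes a Unix-like system, and it only tests whether something answers HTTP on port 8983, not whether the GeoBlacklight-Schema configuration is loaded.

```ruby
require "net/http"

# Returns the full path of an executable found on PATH, or nil.
def find_executable(name, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Returns true if anything answers an HTTP request on host:port.
# Any response counts; connection errors and timeouts count as "no".
def solr_listening?(host = "localhost", port = 8983)
  Net::HTTP.start(host, port, open_timeout: 2, read_timeout: 2) do |http|
    http.head("/solr/")
  end
  true
rescue StandardError
  false
end

%w[git ruby].each do |tool|
  puts "#{tool}: #{find_executable(tool) || 'not found on PATH'}"
end
puts "Solr on port 8983: #{solr_listening? ? 'responding' : 'not responding'}"
```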
Install GeoCombine
If you already have a GeoBlacklight application, skip steps 1 and 2. You can just add gem 'geo_combine' to your GeoBlacklight application's Gemfile.
1. To get started, first clone the GeoCombine repository
   $ git clone https://github.com/OpenGeoMetadata/GeoCombine.git
2. Switch to its folder
   $ cd GeoCombine
3. Install GeoCombine's dependencies
   $ bundle install
4. Create a tmp directory (if it doesn't already exist)
   $ mkdir tmp
5. Clone all of the 'edu.*' repositories to tmp.
   $ rake geocombine:clone
   Since other software projects live in OpenGeoMetadata we only want to clone the metadata repositories. All of these are currently namespaced with "edu.institution.subdomain".
6. Index all of the geoblacklight.xml documents located in cloned repositories.
   $ rake geocombine:index
Go grab a coffee or lunch, because this might take a while! Afterwards, your index should have more than 30,000 new records in it.
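To make the clone and index steps above more concrete, here is a minimal Ruby sketch of the two selection rules they describe: keeping only repositories whose names start with "edu.", and collecting every geoblacklight.xml file under tmp. The helper names and the sample repository names are hypothetical, and this is not how GeoCombine itself is necessarily implemented.

```ruby
# Hypothetical helpers illustrating the clone filter and the index step;
# they are not taken from GeoCombine's source.

# Metadata repositories are namespaced "edu.institution.subdomain",
# so a simple prefix check separates them from software projects.
def metadata_repo?(name)
  name.start_with?("edu.")
end

# Lists every geoblacklight.xml file below the given directory,
# at any depth, in sorted order.
def geoblacklight_files(root = "tmp")
  Dir.glob(File.join(root, "**", "geoblacklight.xml")).sort
end

# Made-up repository names, for illustration only.
names = ["edu.example.library", "GeoCombine", "metadatarepository"]
puts names.select { |n| metadata_repo?(n) }.inspect

puts "#{geoblacklight_files.length} geoblacklight.xml files found under tmp"
```

Running the file count before `rake geocombine:index` gives a rough idea of how many records to expect in Solr afterwards.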