IVOA Registry-in-a-Box (RiaBox)

This package provides a simple implementation of an IVOA Publishing
Registry that is compliant with the harvesting portion of the IVOA
Registry Interfaces standard, v1.0
(http://www.ivoa.net/Documents/latest/RegistryInterfaces, section 3).
It has been adapted from the "OAI-PMH2 XMLFile data provider" Perl
library by Hussein Suleman, developed at Virginia Tech
(www.dlib.vt.edu) and, thus, is fully compliant with the OAI-PMH v2.0
standard (http://www.openarchives.org/).

-----------------------------------------------------------------
Summary of how it works
-----------------------------------------------------------------

It assumes that the records are stored as flat XML files in a
directory, organized into sub-directories according to OAI sets.
These records can be served directly in their native format or
transformed on the fly into another format using XSLT. It can be
configured completely with configuration files.

The typical practice is to store files natively in the VOResource
metadata v1.0 standard (www.ivoa.net/documents/latest/VOResource).
(Note, however, that some users of this library have hacked the code to
retrieve records directly from a local database.) The Dublin Core
format is required by the OAI standard; thus, a stylesheet to do the
conversion is found in the conf directory.

-----------------------------------------------------------------
Installation
-----------------------------------------------------------------

Summary of Installation:
  1. Download and install the xsltproc tool
  2. Unpack the RiaBox distribution
  3. Install the cgi-bin directory into your web server
  4. Configure and run

Here are the details...

1. Download and install the xsltproc tool

xsltproc is used to transform resource description records into other
formats, namely OAI-Dublin Core.

This package contains in the bin directory a precompiled version of
the xsltproc tool built for a RedHat Linux system, but most likely you
will have to rebuild it for your system. To check, just execute
bin/xsltproc without arguments; if it prints a help message, you
are probably good to go.

If you need to rebuild, download it from
http://xmlsoft.org/XSLT/xsltproc2.html and follow the instructions
found there.

It can be installed anywhere, but if you do not put it in the bin
directory, you will have to update the configuration file
accordingly (see step 4).

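The check described in step 1 can also be scripted. The following is a
minimal sketch, not part of the package: the function name
check_xsltproc and the RIABOX_HOME variable are made up for this
illustration, and it probes the binary with xsltproc's --version
option rather than by running it without arguments.

```shell
#!/bin/sh
# Report whether an xsltproc binary can run on this machine.
# Usage: check_xsltproc /path/to/xsltproc
check_xsltproc() {
    candidate="$1"
    if [ ! -x "$candidate" ]; then
        echo "missing: $candidate is not an executable file"
        return 1
    fi
    # A binary built for another platform fails to start and ends up
    # in the else branch.
    if "$candidate" --version >/dev/null 2>&1; then
        echo "ok: $candidate runs on this system"
    else
        echo "rebuild: $candidate exists but does not run here"
        return 1
    fi
}

# RIABOX_HOME is a hypothetical variable; it defaults to the current directory.
RIABOX_HOME="${RIABOX_HOME:-.}"
check_xsltproc "$RIABOX_HOME/bin/xsltproc" \
    || echo "see http://xmlsoft.org/XSLT/xsltproc2.html for rebuilding"
```

Run it from the installation directory, or set RIABOX_HOME first.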
2. Unpack the RiaBox distribution

The RiaBox directory can be placed anywhere. Consider this location the
"installation directory".

3. Install the cgi-bin directory into your web server

The easiest way to do this is to place a link in your server's cgi-bin
directory that points to RiaBox's cgi-bin directory.

For example, if you installed RiaBox as /opt/apps/RiaBox and your web
server is installed in /opt/apps/httpd, then you might type:

   cd /opt/apps/httpd/cgi-bin
   ln -s /opt/apps/RiaBox/cgi-bin riabox

Then the OAI interface is accessible via something like:

   http://yourserver.org/cgi-bin/riabox/oai.pl

4. Configure and run

If xsltproc is installed in the RiaBox/bin directory and your web
server has a link to RiaBox/cgi-bin (as described in step 3), then you
are ready to run. Otherwise, you will need to consult the next
section on configuring the package.

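Once the service is running, one way to try it is to send it a few
standard OAI-PMH requests. The sketch below only builds the request
URLs. The helper name oai_url is invented for this example, and the
base URL is the placeholder from step 3, so substitute your own
server before fetching anything.

```shell
#!/bin/sh
# Build an OAI-PMH request URL from a base URL, a verb, and optional
# extra arguments given as name=value pairs.
oai_url() {
    base="$1"; verb="$2"; shift 2
    url="${base}?verb=${verb}"
    for arg in "$@"; do
        url="${url}&${arg}"
    done
    echo "$url"
}

# Placeholder endpoint from step 3; replace it with your own.
BASE="http://yourserver.org/cgi-bin/riabox/oai.pl"

oai_url "$BASE" Identify
oai_url "$BASE" ListMetadataFormats
oai_url "$BASE" ListSets
oai_url "$BASE" ListRecords metadataPrefix=oai_dc set=ivo_managed

# To actually send a request (requires curl and a reachable server):
#   curl -s "$(oai_url "$BASE" Identify)"
```

Identify, ListMetadataFormats, ListSets and ListRecords are verbs
defined by OAI-PMH v2.0; oai_dc is the Dublin Core prefix that the
OAI standard requires every repository to support.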
-----------------------------------------------------------------
Configuration
-----------------------------------------------------------------

---++ Installing the CGI-BIN directory

The first thing the OAI script, cgi-bin/oai.pl, does is to change into
the directory where RiaBox is installed; this allows it to easily find
all of its various files. If you do not use the link trick, then you
may need to edit the oai.pl script.

In this case, you can update oai.pl:

  o  edit the "chdir" line: remove the "$FindBin::Bin/.." and enter
     the path to the RiaBox directory in double quotes.

  o  if the conf directory is not found in the directory the script
     changes into, edit the "my $OAI = ..." line by changing
     "conf/config.xml" to the path to the configuration file.

     If you have to edit this path, then it is likely that you will
     have to similarly edit the paths found within the configuration
     file.

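Before editing oai.pl, it can help to locate the two lines mentioned
above. This is a small sketch under stated assumptions: the function
name show_edit_points and the RIABOX_HOME variable are illustrative
only, and the search strings are simply the ones quoted in the
bullets.

```shell
#!/bin/sh
# Print, with line numbers, the lines of oai.pl that may need editing:
# the chdir call and the reference to conf/config.xml.
show_edit_points() {
    script="$1"
    grep -n -e 'chdir' -e 'conf/config.xml' "$script"
}

# RIABOX_HOME is a hypothetical variable naming the installation directory.
RIABOX_HOME="${RIABOX_HOME:-/opt/apps/RiaBox}"
if [ -f "$RIABOX_HOME/cgi-bin/oai.pl" ]; then
    show_edit_points "$RIABOX_HOME/cgi-bin/oai.pl"
else
    echo "oai.pl not found under $RIABOX_HOME/cgi-bin"
fi
```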
---++ Editing the config.xml file

Tips:
  o  order of the parameters does not matter (but <metadata>
     sub-parameters must be kept together).

  o  avoid extraneous spaces inside XML tags.

Here are the meanings of the various parameters:

  <repositoryName>  The title of the repository, exposed via the OAI
                    Identify verb.

  <adminEmail>      The administrator's email address, exposed via the
                    OAI Identify verb.

  <recordLimit>     The maximum number of records that will be
                    exported in one OAI call; if more records are
                    available, a resumption token is issued to the
                    user.

  <datadir>         The directory containing the VOResource
                    descriptions to expose.

  <deletedRecord>   The policy for handling deleted records; in the
                    IVOA context, this should be kept set to
                    persistent. See the OAI standard
                    (www.openarchives.org) for more details.

  <filematch>       A Perl regular expression that XML files
                    containing VOResource descriptions must match in
                    order to be exported.

  <confdir>         The directory containing the configuration files,
                    setnames.xml and identity.xml.

  <metadata>        Parameters for supporting different export formats:

     <prefix>       The format identifier as accepted by the OAI
                    metadataPrefix parameter.

     <namespace>    The XML namespace for the format.

     <schema>       The location of the XML schema for the format.
                    This is needed by xsltproc.

     <transform>    The command for transforming VOResource records
                    into other formats. This command must accept a
                    VOResource record via standard input and write
                    the converted record to standard output.

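To show how these parameters fit together, here is a hedged sketch
that writes an illustrative config.xml into a scratch directory. The
root element name, all sample values, the filename pattern and the
stylesheet name VOResource-to-oai_dc.xsl are assumptions made for
this example. Copy the real structure from the conf/config.xml
shipped with the package.

```shell
#!/bin/sh
# Write an illustrative config.xml into a scratch directory.
# SCRATCH is a hypothetical location; no real installation is touched.
SCRATCH="${SCRATCH:-$(mktemp -d)}"
mkdir -p "$SCRATCH/conf"

# ASSUMPTIONS: the <config> root element, the sample values, and the
# stylesheet name are placeholders, not taken from the package.
cat > "$SCRATCH/conf/config.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<config>
  <repositoryName>Example Publishing Registry</repositoryName>
  <adminEmail>registry-admin@example.org</adminEmail>
  <recordLimit>100</recordLimit>
  <datadir>data</datadir>
  <confdir>conf</confdir>
  <deletedRecord>persistent</deletedRecord>
  <filematch>\.xml$</filematch>
  <metadata>
    <prefix>oai_dc</prefix>
    <namespace>http://www.openarchives.org/OAI/2.0/oai_dc/</namespace>
    <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
    <transform>bin/xsltproc conf/VOResource-to-oai_dc.xsl -</transform>
  </metadata>
</config>
EOF

echo "wrote $SCRATCH/conf/config.xml"
```

The transform command ends in "-" so that xsltproc reads the record
from standard input, which matches the contract described for
<transform>.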
---++ Changing the Identity description

The OAI Identify operation will expose the VOResource record for this
Registry. This record is stored as a VOResource XML file called
identity.xml in the configuration directory (see <confdir>). Because
this description is exported with all the other resource descriptions
by other OAI operations, this file by default is a link that points to
the proper record in the data directory (see <datadir>). To change
this record, just edit the file pointed to by this link.

You can add additional descriptions in any XML format; just call them
identity*.xml (e.g. identity2.xml) and place them in the conf
directory.

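The link arrangement can be reproduced as follows. This is only a
sketch in a scratch directory: the layout and the record name
registry.xml are hypothetical, so check your own data directory for
the actual file that identity.xml points to.

```shell
#!/bin/sh
# Recreate the identity.xml link arrangement in a scratch directory.
# The directory layout and the name registry.xml are hypothetical.
SCRATCH="${SCRATCH:-$(mktemp -d)}"
mkdir -p "$SCRATCH/conf" "$SCRATCH/data/ivo_managed"

# Stand-in for the registry's own VOResource record in the data directory.
echo '<!-- VOResource record describing this registry -->' \
    > "$SCRATCH/data/ivo_managed/registry.xml"

# Point conf/identity.xml at that record; -f replaces any existing link.
# The target is resolved relative to the conf directory.
ln -sf ../data/ivo_managed/registry.xml "$SCRATCH/conf/identity.xml"

# Both paths now show the same content.
ls -l "$SCRATCH/conf/identity.xml"
cat "$SCRATCH/conf/identity.xml"
```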
---++ Changing Set Descriptions and Membership

The OAI standard allows records to be organized into "sets" that allow
harvesters to retrieve only the portion of the records that falls into
a specified set. RiaBox, for example, supports a special set mandated
by the IVOA Registry Interfaces standard called ivo_managed (see
conf/setnames-sample.xml for details). This package allows you to
configure the description and membership of sets; this is done via the
setnames.xml file and/or various _set_ files. The description is
subsequently exposed via the OAI ListSets operation.

First note that in order to define a set, there must be a directory in
the data directory (see <datadir>) whose name matches the set
identifier (i.e. the one that a harvester passes to an OAI operation
via the set parameter). If the directory doesn't exist, the set will
not be supported. That directory, though, can be empty. Subdirectories
define sub-sets, too. If no configuration file describing a set
exists, the OAI service will provide a trivial description. When a
client requests a particular set, the OAI service, by default, will
return all of the records found in the directory of the same name (and
those in its subdirectories, recursively).

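The directory convention can be illustrated with a scratch data
directory. In this sketch every name other than ivo_managed is made
up, the helper records_in_set is not part of the package, and the
'*.xml' pattern merely stands in for whatever <filematch> is
configured to accept.

```shell
#!/bin/sh
# Lay out a scratch data directory with one set and one sub-set, then
# list the records a request for a set would cover by default.
SCRATCH="${SCRATCH:-$(mktemp -d)}"
DATADIR="$SCRATCH/data"

mkdir -p "$DATADIR/ivo_managed/surveys"   # a sub-directory defines a sub-set
mkdir -p "$DATADIR/archive"               # an empty directory still defines a set
: > "$DATADIR/ivo_managed/registry.xml"
: > "$DATADIR/ivo_managed/surveys/survey1.xml"

# Print every XML record in a set directory and, recursively, below it.
records_in_set() {
    find "$DATADIR/$1" -type f -name '*.xml' | sort
}

records_in_set ivo_managed
records_in_set archive
```

The first call lists both records, including the one in the
sub-directory; the second lists nothing, because the archive set
exists but is empty.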
A set can be more fully described either with a setnames.xml file in
the configuration directory (see <confdir>) or with a special file
called _set_ which resides in the set directory itself. Both use
essentially the same XML format for their data. If both exist, the
description in the setnames.xml file takes precedence.

The setnames.xml file contains a root element <setnames> which
contains inside it a <set> element for each set to be described. The
_set_ file contains just a single <set> element. Here are the
meanings of the elements a <set> contains (in any order):

  <spec>         the identifier for the set (used by harvesters via the
                 set= parameter). _set_ files should *not* include
                 this element.

  <name>         a longer title for the set indicating the contents.

  <description>  a longer description of the set and why it might be
                 used.

  <includedIn>   adds the members of this set to the set named in
                 this element.

Notes:
  o  A <set> description in setnames.xml overrides the one found in
     the set directory's _set_ file.

  o  Do not include a <spec> element in a _set_ file. If it does
     include one, it must match the set name for the directory it
     lives in or it will be ignored.

  o  Start a _set_ file with a single <set> element. It's actually
     okay if it starts with a wrapping <setnames> element with a
     single <set> child element; however, if multiple <set> elements
     appear, you will probably get unexpected results.

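The two description files might look like the following. This is a
sketch written into a scratch directory: the surveys set and all name
and description texts are invented for illustration, and only the
element names come from the list above.

```shell
#!/bin/sh
# Write a sample setnames.xml and a sample _set_ file.
# SCRATCH is a hypothetical location; no real installation is touched.
SCRATCH="${SCRATCH:-$(mktemp -d)}"
mkdir -p "$SCRATCH/conf" "$SCRATCH/data/ivo_managed" "$SCRATCH/data/surveys"

# setnames.xml: a <setnames> root with one <set> per described set.
cat > "$SCRATCH/conf/setnames.xml" <<'EOF'
<setnames>
  <set>
    <spec>ivo_managed</spec>
    <name>Records managed by this registry</name>
    <description>Sample description text for the ivo_managed set.</description>
  </set>
</setnames>
EOF

# _set_: a single <set> element with no <spec>; the name of the
# directory it lives in (surveys) serves as the set identifier.
cat > "$SCRATCH/data/surveys/_set_" <<'EOF'
<set>
  <name>Survey resources</name>
  <description>Hypothetical set collecting survey descriptions.</description>
  <includedIn>ivo_managed</includedIn>
</set>
EOF

echo "wrote $SCRATCH/conf/setnames.xml and $SCRATCH/data/surveys/_set_"
```

Here the <includedIn> element would add the members of the
hypothetical surveys set to ivo_managed as well.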