VORegistryInABox: README

File README, 9.5 kB (added by rplante, 1 year ago)
1 IVOA Registry-in-a-Box (RiaBox)
2
3 This package provides a simple implementation of an IVOA Publishing
4 Registry that is compliant with the harvesting portion of the IVOA
5 Registry Interfaces standard, v1.0
6 (http://www.ivoa.net/Documents/latest/RegistryInterfaces, section 3).
7 It has been adapted from the "OAI-PMH2 XMLFile data provider" Perl
8 library by Hussein Suleman, developed at Virginia Tech
9 (www.dlib.vt.edu) and, thus, is fully compliant with the OAI-PMH v2.0
10 standard (http://www.openarchives.org/).
11
12 -----------------------------------------------------------------
13 Summary of how it works
14 -----------------------------------------------------------------
15
16 It assumes that the records are stored as flat XML files in a
17 directory, organized into sub-directories according to OAI sets.
18 These records can be served directly in their native format or
19 transformed on the fly into another format using XSLT.  It can be
20 configured completely with configuration files.
21
22 The typical practice is to store files natively in the VOResource
23 metadata v1.0 standard (www.ivoa.net/documents/latest/VOResource). 
24 (Note, however, that some users of this library have hacked the code to
25 retrieve records directly from a local database.)  The Dublin Core
26 format is required by the OAI standard; thus, a stylesheet to do the
27 conversion is found in the conf directory.
28
29 -----------------------------------------------------------------
30 Installation
31 -----------------------------------------------------------------
32
33 Summary of Installation:
34   1. Download and install the xsltproc tool
35   2. Unpack the RiaBox distribution.
36   3. Install the cgi-bin directory into your web server
37   4. Configure and Run
38
39 Here are the details...
40
41 1.  Download and install the xsltproc tool
42
43 xsltproc is used to transform resource description records into other
44 formats, namely OAI-Dublin Core. 
45
46 This package contains in the bin directory a precompiled version of
47 the xsltproc tool built for a RedHat Linux system, but most likely you
48 will have to rebuild this for your system.  To check, just execute
49 bin/xsltproc without arguments; if it spits out a help message, you
50 are probably good to go. 
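
The check above can also be scripted. The following is a minimal sketch, assuming it is run from the RiaBox installation directory; the function name check_xsltproc is hypothetical and not part of the package. It uses xsltproc's --version option and falls back to any copy found on the PATH.

```shell
# Report which xsltproc (if any) is usable.
# check_xsltproc is a hypothetical helper name, not part of RiaBox.
check_xsltproc() {
    if [ -x bin/xsltproc ] && bin/xsltproc --version >/dev/null 2>&1; then
        # The precompiled binary shipped in bin/ runs on this system.
        echo "bundled: bin/xsltproc"
    elif command -v xsltproc >/dev/null 2>&1; then
        # A system-wide copy exists; the configuration must point to it.
        echo "system: $(command -v xsltproc)"
    else
        echo "none: rebuild xsltproc for this system"
    fi
}

check_xsltproc
```

If the script reports a system copy rather than the bundled one, the path used in the configuration file has to be updated to match.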
51
52 If you need to rebuild, download it from
53 http://xmlsoft.org/XSLT/xsltproc2.html and follow the instructions
54 found there. 
55
56 It can be installed anywhere, but if you do not put it in the bin
57 directory, you will have to update the configuration file
58 accordingly (see step 4). 
59
60 2. Unpack the RiaBox distribution.
61
62 The RiaBox directory can be placed anywhere.  Consider this location the
63 "installation directory". 
64
65 3. Install the cgi-bin directory into your web server
66
67 The easiest way to do this is to place a link in your server's cgi-bin
68 directory that points to RiaBox's cgi-bin directory. 
69
70 For example, if you installed RiaBox as /opt/apps/RiaBox and your web
71 server is installed in /opt/apps/httpd, then you might type:
72
73    cd /opt/apps/httpd/cgi-bin
74    ln -s /opt/apps/RiaBox/cgi-bin riabox
75
76 Then the OAI interface is accessible via something like:
77
78    http://yourserver.org/cgi-bin/riabox/oai.pl
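
Once the link is in place, a quick smoke test is to issue a few OAI-PMH requests. The sketch below only builds the request URLs; oai_url is a hypothetical helper, and the base URL is the placeholder used above. The verbs (Identify, ListSets, ListRecords) come from OAI-PMH v2.0, and ivo_vor and ivo_managed are the metadata prefix and set name defined by the IVOA Registry Interfaces standard.

```shell
# Placeholder base URL from the example above; replace with your own server.
BASE_URL="http://yourserver.org/cgi-bin/riabox/oai.pl"

# Build an OAI-PMH request URL from a verb and optional key=value arguments.
# oai_url is a hypothetical helper, not part of RiaBox.
oai_url() {
    url="${BASE_URL}?verb=$1"
    shift
    for arg in "$@"; do
        url="${url}&${arg}"
    done
    printf '%s\n' "$url"
}

oai_url Identify
oai_url ListSets
oai_url ListRecords metadataPrefix=ivo_vor set=ivo_managed
```

To send a request, pass the result to an HTTP client, for example: curl -s "$(oai_url Identify)". A working installation should answer with an OAI-PMH XML response rather than a server error.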
79
80 4. Configure and Run
81
82 If xsltproc is installed in the RiaBox/bin directory and your web
83 server has a link to RiaBox/cgi-bin (as described in step 3.), then you
84 are ready to run.  Otherwise, you will need to consult the next
85 section on configuring the package. 
86
87 -----------------------------------------------------------------
88 Configuration
89 -----------------------------------------------------------------
90
91 ---++ Installing the CGI-BIN directory
92
93 The first thing the OAI script, cgi-bin/oai.pl, does is to change into
94 the directory where RiaBox is installed; this allows it to easily find
95 all of its various files.  If you do not use the link trick, then you
96 may need to edit the oai.pl script. 
97
98 In this case, you can update oai.pl:
99
100   o  edit the "chdir" line: remove the "$FindBin::Bin/.." and enter
101      the path to the RiaBox directory in double quotes.
102
103   o  if the conf directory is not found in the directory the script
104      changes into, edit the "my $OAI = ..." line by changing
105      "conf/config.xml" to the path to the configuration file. 
106
107      If you have to edit this path, then it is likely that you will
108      have to similarly edit the paths found within this file. 
109
110 ---++ Editing the config.xml file
111
112 Tips:
113   o  order of the parameters does not matter (but <metadata>
114      sub-parameters must be kept together).
115
116   o  avoid extraneous spaces inside XML tags.
117
118 Here are the meanings of the various parameters: 
119
120   <repositoryName>  The title of the repository, exposed via the OAI
121                     Identify verb.
122
123   <adminEmail>      The administrator's email address; exposed via the OAI
124                     Identify verb.
125
126   <recordLimit>     The maximum number of records that will be
127                     exported in one OAI call; if more records are
128                     available a resumption token is issued to the
129                     user. 
130
131   <datadir>         the directory containing the VOResource
132                     descriptions to expose. 
133
134   <deletedRecord>   The policy for handling deleted records; in the
135                     IVOA context, this should be kept set to
136                     persistent.  See the OAI standard
137                     (www.openarchives.org) for more details.
138
139   <filematch>       A perl regular expression that XML files
140                     containing VOResource descriptions must match in
141                     order to be exported. 
142
143   <confdir>         The directory containing the configuration files,
144                     setnames.xml and identity.xml. 
145
146   <metadata>        parameters for supporting different export formats
147
148     <prefix>        the format identifier as accepted by the OAI
149                       metadataPrefix parameter. 
150
151     <namespace>     the XML namespace for the format.
152
153     <schema>        the location of the XML schema for the format.
154                       This is needed by xsltproc.
155
156     <transform>     the command for transforming VOResource records
157                       into other formats.  This command must accept a
158                       VOResource record via standard input and send
159                       the output format to standard output. 
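
The <transform> contract (record on standard input, converted record on standard output) can be tried from the command line before the command goes into config.xml. This is a minimal sketch: run_transform is a hypothetical helper, the record is a throwaway file, and "cat" stands in as an identity transform so the example runs without a stylesheet.

```shell
# Feed one record to a transform command the way <transform> requires:
# the record arrives on standard input, the result leaves on standard output.
# run_transform is a hypothetical helper, not part of RiaBox.
run_transform() {
    record_file="$1"
    shift
    "$@" < "$record_file"
}

# Throwaway record used only for this demonstration.
tmpdir=$(mktemp -d)
printf '<record>example</record>\n' > "$tmpdir/record.xml"

# "cat" copies its input unchanged, so it satisfies the contract trivially.
run_transform "$tmpdir/record.xml" cat

# A real entry would normally be an xsltproc command; a trailing "-" asks
# xsltproc to read the document from standard input.  The paths below are
# placeholders, not names shipped with the package:
#   run_transform data/some-record.xml bin/xsltproc conf/some-stylesheet.xsl -
```

Any command that behaves this way can be used, which is why a transform does not have to be based on XSLT.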
160
161 ---++ Changing the Identity description
162
163 The OAI Identify operation will expose the VOResource record for this
164 Registry.  This record is stored as a VOResource XML file called
165 identity.xml in the configuration directory (see <confdir>).  Because
166 this description is exported with all the other resource descriptions
167 by other OAI operations, this file by default is a link that points to
168 the proper record in the data directory (see <datadir>).  To change
169 this record just edit the file pointed to by this link. 
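
If the link is missing or needs to point at a different record, it can be recreated by hand. The sketch below works in a throwaway directory that stands in for an installation; every file name except identity.xml is hypothetical, including the choice of set directory.

```shell
# Throwaway tree standing in for a RiaBox installation; all names other
# than identity.xml are hypothetical.
root=$(mktemp -d)
mkdir -p "$root/conf" "$root/data/ivo_managed"
printf '<!-- VOResource record describing this registry -->\n' \
    > "$root/data/ivo_managed/registry.xml"

# Make conf/identity.xml a relative link to the record in the data
# directory, so one file serves Identify and the other OAI operations.
( cd "$root/conf" && ln -s ../data/ivo_managed/registry.xml identity.xml )

# Show where the link points.
ls -l "$root/conf/identity.xml"
```

A relative link keeps working if the whole installation directory is moved, as long as the conf and data directories stay in the same positions relative to each other.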
170
171 You can add additional descriptions in any XML format; just call them
172 identity*.xml (e.g. identity2.xml) and place them in the conf
173 directory. 
174
175 ---++ Changing Set Descriptions and Membership
176
177 The OAI standard allows records to be organized into "sets" that allow
178 harvesters to retrieve only the portion of the records that falls into
179 a specified set.  RiaBox, for example, supports a special set mandated
180 by the IVOA Registry Interfaces standard called ivo_managed (see
181 conf/setnames-sample.xml for details).  This package allows you to
182 configure the description and membership of sets; this is done via the
183 setnames.xml file and/or various _set_ files.  The description is
184 subsequently exposed via the OAI ListSets operation.
185
186 First note that in order to define a set, there must be a directory in
187 the data directory (see <datadir>) whose name matches the set
188 identifier (i.e. that a harvester passes to an OAI operation via the
189 set parameter).  If the directory doesn't exist, the set will not be
190 supported.  That directory, though, can be empty.  Subdirectories
191 define sub-sets, too.  If no configuration file describing a set
192 exists, the OAI service will provide a trivial description.  When a
193 client requests a particular set, the OAI service, by default, will
194 return all of the records found in the directory of the same name (and
195 those in its subdirectories, recursively). 
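
The directory rules above can be illustrated with a small throwaway data directory. This is a sketch only: records_in_set is a hypothetical helper that approximates the default membership rule with find, the directory and file names other than ivo_managed are made up, and the '*.xml' pattern stands in for whatever <filematch> is configured.

```shell
# Throwaway stand-in for <datadir> with one set, one sub-set and one
# empty set.  All names except ivo_managed are hypothetical.
datadir=$(mktemp -d)
mkdir -p "$datadir/ivo_managed/surveys" "$datadir/empty_set"
touch "$datadir/ivo_managed/registry.xml" \
      "$datadir/ivo_managed/surveys/survey1.xml"

# List the records a request for a set covers by default: every matching
# file in the set directory and, recursively, in its subdirectories.
# records_in_set is a hypothetical helper, not part of RiaBox.
records_in_set() {
    find "$1/$2" -type f -name '*.xml' | sort
}

records_in_set "$datadir" ivo_managed   # lists both records
records_in_set "$datadir" empty_set     # lists nothing; the set still exists
```

Because membership follows the directory tree, moving a record file into or out of a set directory is enough to change which sets it belongs to.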
196
197 A set can be more fully described either with a setnames.xml file in
198 the configuration directory (see <confdir>) or with a special file
199 called _set_ which resides in the set directory itself.  Both use
200 essentially the same XML format for their data.  If both exist, the
201 description in the setnames.xml file takes precedence. 
202
203 The setnames.xml file contains a root element <setnames> which
204 contains inside it a <set> element for each set to be described.  The
205 _set_ file contains just a single <set> element.  Here are the
206 meanings of the elements it contains (in any order):
207
208 <spec>          the identifier for the set (used by harvesters via the
209                 set= parameter).  _set_ files should *not* include
210                 this element.
211
212 <name>          a longer title for the set indicating the contents
213
214 <description>   a longer description of the set and why it might be
215                 used. 
216
217 <includedIn>    This adds the members of this set to the set named in
218                 this element. 
219
220 Notes:
221   o  A <set> description in setnames.xml overrides the one found in
222      the set directory's _set_ file. 
223
224   o  Do not include a <spec> element in a _set_ file.  If it does include
225      one, it must match the set name for the directory it lives in or
226      it will otherwise be ignored. 
227
228   o  Start a _set_ file with a single <set> element.  It's actually
229      okay if it starts with a wrapping <setnames> element with a
230      single <set> child element; however, if multiple <set> elements
231      appear, you will probably get unexpected results. 
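
As an illustration of the two ways to describe a set, the sketch below writes a _set_ file and a setnames.xml file into throwaway directories. The element names are the ones listed above; the titles and descriptions are invented, and any XML declaration or namespace the parser may expect is not shown, so compare the result with conf/setnames-sample.xml before using it.

```shell
# Throwaway stand-ins for <datadir> and <confdir>.
datadir=$(mktemp -d)
confdir=$(mktemp -d)
mkdir -p "$datadir/ivo_managed"

# _set_ file: a single <set> element and no <spec>; the name of the
# directory it lives in supplies the set identifier.
cat > "$datadir/ivo_managed/_set_" <<'EOF'
<set>
  <name>Managed records</name>
  <description>Records that originate from this registry.</description>
</set>
EOF

# setnames.xml: a <setnames> root with one <set> element per described
# set, each carrying its own <spec>.
cat > "$confdir/setnames.xml" <<'EOF'
<setnames>
  <set>
    <spec>ivo_managed</spec>
    <name>Managed records</name>
    <description>Records that originate from this registry.</description>
  </set>
</setnames>
EOF

cat "$datadir/ivo_managed/_set_" "$confdir/setnames.xml"
```

With both files present, the description in setnames.xml is the one exposed through ListSets.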
232
233