The British Caving Association

Any views expressed are not necessarily those of the BCA

All times are UTC [ DST ]




Posted: Tue 04 Mar 2014 11:56
Help needed in putting Speleology back-issues online

Summary: We want to put Speleology back-issues online (probably for free access). It would be helpful if someone could first type up a list of the contents of each issue, together with each article's standfirst text, for our database. A related task is to re-compile the PDFs to an ebook size and to chop them up into individual articles, naming them in accordance with our database convention. For this task you will need Acrobat, or an equivalent program that can re-distil PDF files.

Details: Database
The database format for our online content is a set of plain text files in a format based closely on EndNote. The specification is online but, as a first step, it is probably simpler to ask someone to assemble the data in tabular form, either in Excel or as a table in MS Word. I would strongly suggest a table in Word, because it is far easier to edit, format and view than one in Excel. For each issue of the magazine, we need a separate table with the following columns:

  • Page range (e.g. 7-12)
  • Article type (Feature or left blank)
  • Title of Article
  • Authors
  • Standfirst Text
This data is not intended to be human-readable on our web site; instead it will be interpreted by a set of programs that display formatted web pages. It is therefore absolutely essential that it is in exactly the correct format for a machine to interpret. However, unless you want to work your way through our specification, the easiest thing is probably for you to work on one issue of Speleology, after which I will explain how what you've done needs to be tweaked to match the specification.
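To give a flavour of the kind of checking a program might do on one row of such a table, here is a minimal Python sketch. The ContentsRow class, its field names and the two helper functions are hypothetical illustrations and are not taken from our specification. Accepting a single page number (e.g. 7) and an en dash in place of a hyphen are also assumptions.

```python
from dataclasses import dataclass
import re


@dataclass
class ContentsRow:
    """One row of the per-issue table (field names are illustrative only)."""
    page_range: str    # e.g. "7-12"
    article_type: str  # "Feature" or "" (left blank)
    title: str
    authors: str
    standfirst: str


def parse_page_range(text):
    """Return (first, last) page numbers from text such as '7-12' or '7'."""
    # Word often turns a hyphen into an en dash, so accept either (assumption).
    match = re.fullmatch(r"\s*(\d+)\s*(?:[-\u2013]\s*(\d+))?\s*", text)
    if match is None:
        raise ValueError(f"unrecognised page range: {text!r}")
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else first
    if last < first:
        raise ValueError(f"page range runs backwards: {text!r}")
    return first, last


def check_row(row):
    """Return a list of problems found in a row (empty if none were found)."""
    problems = []
    try:
        parse_page_range(row.page_range)
    except ValueError as err:
        problems.append(str(err))
    if row.article_type not in ("Feature", ""):
        problems.append(
            f"article type must be 'Feature' or blank: {row.article_type!r}"
        )
    if not row.title.strip():
        problems.append("title is empty")
    return problems
```

A check like this only catches obvious slips, such as a reversed page range or a missing title. It is not a substitute for matching the real specification.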

The bulk of the work can probably be done by pasting text from the PDFs (which can be made available to you) to get the article titles and the standfirst text (that is, the summary text that comes after the title of the article). The salient point is that the text must be HTML-safe, so you will need to weed out any unsafe characters or escape them using an HTML entity. (But this could probably be done as a secondary exercise – the main point is to have the text in a table, in an editable format.)
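For anyone comfortable with a little scripting, the escaping step could be sketched in Python roughly as follows. This assumes that "HTML-safe" means escaping the HTML special characters and replacing anything outside plain ASCII with a numeric entity; our specification may ask for something different. The make_html_safe name and the sample title are invented for illustration.

```python
import html


def make_html_safe(text):
    """Escape &, <, > and quotes, then replace non-ASCII characters
    with numeric character references such as &#8211;."""
    escaped = html.escape(text, quote=True)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Invented sample title containing an ampersand, angle brackets and an en dash.
print(make_html_safe("Caves & <mines> \u2013 a survey"))
# prints: Caves &amp; &lt;mines&gt; &#8211; a survey
```

The order matters here: the special characters are escaped first, so the ampersands in the numeric entities added afterwards are not escaped a second time.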

Details: PDFs
A related task is to re-compile the PDFs to an ebook size and to chop them up into individual articles, naming them in accordance with our database convention. For this task you will need Acrobat, or an equivalent program that can re-distil PDF files. Not really much more to say about that.

