Selecting subject-specific records from the Bielefeld Academic Search Engine (Part 1)

Introduction

Since 2004, the Bielefeld Academic Search Engine (BASE) has offered an aggregated metadata search over scientific publications. By indexing well over 300 million records (60% of which are open access) from almost 10,000 international repositories, it provides researchers with invaluable access to publications. In addition to offering a VuFind-based discovery system, BASE makes the aggregated data available via both a live search API and an OAI-PMH interface to enable re-use by third-party services.

Providing a user login via ORCID accounts

Why you may want to have authenticated users

There are a few features of web applications that we, as users, have grown accustomed to, such as bookmarking items on our favourite marketplaces. For a Specialised Information Service, this feature is especially appealing, since it fits the workflow of a good number of scientists and humanists. As a researcher, you can search a catalogue and bookmark anything that catches your interest for further review.

Tips for cleaner, faster and more maintainable XSLT code

Like many other Specialised Information Services (FID), we use XSLT to map XML metadata from data providers to our data model, an extended version of the RDF/XML-based Europeana Data Model (EDM). In the FID Performing Arts (FID DK), we currently receive data from 22 data providers. They deliver it in 6 different official metadata standards, such as MARC21, EAD and LIDO, as well as in 10 individual data formats that result from working with database systems like MS Access or FAUST DB.

Providing BEACON files in discovery systems

When hearing the term beacon, people might think of flares or a lighthouse at first. A BEACON file can certainly be a guide in the ocean of different authority files on the web. In this post, I will take a closer look at BEACON files, their implementation and why they are a useful addition to discovery systems like the Specialised Information Service Performing Arts (FID DK).

Introduction to BEACON

Authority data disambiguates and represents controlled entities like persons, corporate bodies, places, topics, works and events via unique identifiers.

CPU-intensive Python Web Backends with asyncio and multiprocessing, Part II

In the first post of this series, I looked at how to achieve parallel execution in Python using multiprocessing and discussed why this approach is unsuitable for WSGI-based web frameworks: WSGI only allows the web server, not the framework, to create new processes. At the end, I mentioned several alternative Python HTTP servers which use asynchronous I/O with an event-loop-based scheduler to handle parallelism. In this post, we will look at how asynchronous I/O works in general, and specifically how it works in Python.