Single Sign-On with Django and Keycloak

2024-07-17

1251 words 6 mins read

Introduction

User management is a fundamental aspect of web application development. While a local login is the simplest method to grant users access to resources, it comes with certain limitations. If users want to access multiple services using the same credentials, a centralized authentication system is necessary.

For a diverse user base (spanning different contexts such as industry specialists, academia, and various organizations or locations), a Single Sign-On (SSO) solution is essential. Well-known identity providers (IdPs) such as GitHub¹ in the tech world, ORCID² for the scientific community, and the lesser-known DFN-AAI³ in German academia enable users to reuse their credentials across services.

Data Engineering with luigi - Lessons learned

2023-08-17

2013 words 10 mins read

Introduction

At the UB JCS, we make extensive use of the Python luigi framework for data engineering. The framework can handle thousands of tasks, resolve non-circular task dependencies, and run for days. Additionally, it provides a convenient web control panel, e.g. to view the task dependencies as a tree diagram or to start specific tasks.

Although luigi already supports the user by enforcing a very specific structure, there are still some things to consider when designing a data pipeline with luigi (for a general introduction, see a previous post). In this post, I present ideas that I learned while using luigi. Since luigi is a heavily object-oriented framework, some approaches in this post naturally rely on software architecture patterns.

Common engineering strategies in luigi

2023-06-29

2097 words 10 mins read

Introduction

For many automated data processing tasks within the context of the Specialised Information Services (FID) at the University Library Frankfurt, we use the Python package luigi. This package proves especially useful when a task (e.g. loading data into a database) depends on other tasks that have to run successfully before it can start (e.g. first you need to download the data). luigi orchestrates all required tasks, together with their own requirements, and then processes everything for you. This approach makes the maintenance of tasks very easy: you only have to add or remove required tasks from any task and luigi handles the rest, without you having to worry too much about the computer science behind it. But although luigi takes a lot of mental load off you, it also requires strategies to handle common situations that you may find yourself in.
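
To make the dependency idea concrete, here is a toy sketch in plain Python. It deliberately does not use luigi's API and is not how luigi's scheduler is implemented; it only illustrates how an execution order can be derived from declared requirements, using the standard library's graphlib module. The task names are hypothetical and chosen to mirror the download/load example above.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each task maps to the set of tasks it requires.
requirements = {
    "load_into_database": {"transform_data"},
    "transform_data": {"download_data"},
    "download_data": set(),
}


def execution_order(requirements):
    """Return the tasks in an order where every requirement comes first.

    graphlib raises CycleError if the requirements are circular.
    """
    return list(TopologicalSorter(requirements).static_order())


order = execution_order(requirements)
print(order)
```

Adding or removing an entry in a task's requirement set is all it takes to change the resulting order, which is the maintenance benefit described above.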

Using another metadata standard than MARC21 in VuFind, Part II

2023-03-31

1518 words 8 mins read

Introduction

In the first post of this series, we covered the necessary steps to populate VuFind’s Solr cores with title and authority records. This second post describes changes to configuration files, as well as the modifications that are necessary to interact with, display and export records. All our customizations are based on existing VuFind code and, to ensure maintainability, stored in the local/ folder and the custom module Fiddk. We want to remind the reader that we assume basic VuFind knowledge, which can be acquired from the documentation.

Using another metadata standard than MARC21 in VuFind, Part I

2023-02-06

1682 words 8 mins read

Introduction

When you install the open source discovery system VuFind, follow the basic configuration steps and feed it with library records, it works well out of the box and provides you with faceted search results and the possibility to browse through your data, among many other features. The easiest way to achieve this is to load the standard interchange format for library records, i.e. MARC21, into the included Apache Solr-based search index. The FID Performing Arts has used VuFind since 2015, but as mentioned in an earlier post, we receive a vast amount of metadata from performing arts museums and archives in standards other than MARC21, such as EAD, METS/MODS and LIDO, as well as other individual data formats that result from database systems like MS Access or FAUST DB. In order to meet the special requirements of performing arts metadata and to work with a consistent data set during data aggregation, we decided to map all received data into an extended version of the universal and flexible Europeana Data Model (EDM).