In this presentation, Hugues Cazeaux, head of the IT department at the University of Geneva with over 25 years of experience, addresses the issue of digital preservation within the institution.
He structures the presentation in six parts that progressively explore the issues faced and the solutions implemented at the University of Geneva.
1. The DLCM project: a progressive approach to archiving
The first part is devoted to the DLCM (Data Life Cycle Management) project, a national programme funded by swissuniversities and deployed over six years in three successive phases:
- Phase 1 (2015-2018): Evaluation of open source solutions adapted to archiving and data management needs.
- Phase 2 (2018-2020): Development and implementation of DLCM solutions, leading to the creation of a new service.
- Phase 3 (2021): Launch of OLOS, a SaaS solution dedicated to small institutions without an archiving service, offering them an efficient way to store and structure their data.
2. The impact of legislative developments on archiving
The second part of the presentation deals with the draft laws that have influenced storage infrastructures and associated services. The objective was to optimise access to and use of research data within the Geneva universities. This project, approved in 2017, was rolled out over a period of seven years.
3. Yareta: a digital archiving solution
A significant part of the presentation is dedicated to Yareta, a digital repository based on open source technologies and put into service in 2019. Today, it holds more than 1,000 archives and more than 15 TB of data.
Yareta’s development built on the results of the DLCM project and on funding from the Swiss National Science Foundation (SNSF). In March 2024, the solution obtained CoreTrustSeal (CTS) certification, which attests to its compliance with best practices in digital archiving.
Yareta’s strategy is based on maximum accessibility, in line with the open access principle, with the motto ‘As open as possible, as closed as necessary’.
4. Administrative and heritage archiving: meeting specific needs
Digital archiving is not just about research. Work has been carried out to identify and meet the conservation needs of administrative and heritage documents. The evaluation focused on several criteria:
- The archiving needs of the Rectorat
- The digitisation of academic archives
- The conservation of digitised documents
- The modernisation of archival collection cataloguing systems
5. Hedera: structuring data for the digital humanities
Launched in 2024, Hedera is part of an approach aimed at standardising and homogenising data related to the digital humanities.
This project is based on a structured process in several stages:
- Transformation of metadata into RDF (Resource Description Framework).
- Consolidation of data thanks to a common and shared ontology.
The objective is to make data and metadata more accessible to researchers and users, by facilitating their interoperability and exploitation.
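The first stage listed above, transforming metadata into RDF, can be illustrated with a minimal sketch. It assumes a flat metadata record whose field names happen to match Dublin Core terms, and it serialises the result as N-Triples using only the standard library. The base URI, the record identifier and the sample values are hypothetical, and nothing here reflects Hedera's actual ontology or pipeline.

```python
# Minimal sketch: turn a flat metadata record into RDF triples (N-Triples syntax).
# Assumptions: field names match Dublin Core terms; BASE is a made-up namespace.

DCTERMS = "http://purl.org/dc/terms/"
BASE = "https://example.org/archive/"  # hypothetical base URI for records


def escape_literal(value: str) -> str:
    """Escape the characters that N-Triples does not allow raw in a literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record_id: str, metadata: dict[str, str]) -> list[str]:
    """Return one N-Triples line per metadata field, sorted by field name."""
    subject = f"<{BASE}{record_id}>"
    triples = []
    for field, value in sorted(metadata.items()):
        predicate = f"<{DCTERMS}{field}>"
        triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


if __name__ == "__main__":
    sample = {"title": "Letter to the Rectorat", "creator": "Unknown"}
    for line in record_to_ntriples("rec-001", sample):
        print(line)
```

Because every record ends up as subject, predicate and object statements over shared predicates, records from different collections can be merged and queried together, which is the interoperability benefit a common ontology is meant to provide.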
The main challenge lies in the introduction of a new archival description format, intended to replace the older ISAD(G) standard, in order to improve the structuring and sustainability of digital archives.
6. DNAMIC: digital archiving via DNA
Finally, the last part of the presentation focuses on the DNAMIC project, an innovative initiative that explores data archiving in DNA.
The idea is to develop a prototype to automate the encoding, storage and decoding of archives using a micro-factory and the advances of the DLCM project.
Why use DNA as a storage medium?
- It offers a high density of information in a small space.
- It can be stored at room temperature, thus reducing energy costs.
- It can be replicated easily, which supports long-term data preservation.
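To give an idea of what encoding and decoding an archive in DNA means, here is a deliberately naive sketch that maps every two bits of data to one of the four nucleotides. The bit-to-base mapping is an arbitrary assumption, not the scheme used by DNAMIC; real encodings also add error correction and avoid sequences that are hard to synthesise or read, none of which is shown here.

```python
# Naive sketch: 2 bits per nucleotide, so one byte becomes four bases.
# Assumption: the mapping A=00, C=01, G=10, T=11 is chosen arbitrarily.

BASES = "ACGT"


def encode(data: bytes) -> str:
    """Encode bytes as a nucleotide string, most significant bits first."""
    strand = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            strand.append(BASES[(byte >> shift) & 0b11])
    return "".join(strand)


def decode(strand: str) -> bytes:
    """Decode a nucleotide string produced by encode() back into bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    data = bytearray()
    for start in range(0, len(strand), 4):
        byte = 0
        for base in strand[start:start + 4]:
            # str.index raises ValueError on any character outside ACGT
            byte = (byte << 2) | BASES.index(base)
        data.append(byte)
    return bytes(data)


if __name__ == "__main__":
    strand = encode(b"DLCM")
    print(strand, decode(strand))
```

At two bits per base, each byte costs only four nucleotides, which is where the density argument comes from.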
Through these initiatives, the University of Geneva is demonstrating its commitment to digital preservation, relying on innovative solutions tailored to the needs of research, administration and academic heritage.
In this presentation, Marion Ville, from the Vitam programme, explains the tools used within the French public archives network to facilitate the collection and archiving of office documents.
1 – The context of public archives in France
In France, the archiving of documents produced by the public administration is overseen by a structured network:
- The Comité Interministériel des Archives de France (Interministerial Committee for the Archives of France) defines the main strategic orientations for public archives.
- The Interministerial Service of the Archives of France (SIAF), attached to the Ministry of Culture, coordinates and evaluates the State’s archiving activities, with the exception of the archives of the Ministry of Europe and Foreign Affairs and those of the army, which have their own autonomy.
2 – The Vitam programme
The Vitam programme is an electronic archiving system funded by several ministries (Culture, Foreign Affairs, Army). It aims to structure and modernise the management of public archives by guaranteeing their durability and accessibility.
3 – The challenges of electronic archiving
Public archive services face two major challenges:
- Data processing: It is essential to archive and quickly retrieve documents in a growing volume of information.
- The submission package: The archived documents must comply with the SEDA (Standard d’Échange de Données pour l’Archivage) standard and be delivered as SEDA XML.
4 – The archiving tools presented
Archifiltre-Docs
Developed in 2019 by the Archives Mission of the social ministries, this tool allows you to visualise and organise file trees. It helps to sort, describe, enrich and identify documents that can be eliminated before their final archiving.
ReSIP
Created by the Vitam programme and available as open source, ReSIP allows you to import, process and export SIPs (Submission Information Packages), making it a more advanced solution than the older tools, which were mainly used for testing.
Octave
Developed by the Interministerial Service of the Archives of France (SIAF), Octave facilitates the creation of submission packages for office documents. It is specifically designed to meet the needs of local public archives.
5 – Demonstration and conclusion
The presentation ends with a demonstration of the three tools in action, illustrating how they enable the collection, organisation and integration of public archives.
In conclusion, solutions such as Archifiltre-Docs, ReSIP, Octave and other electronic archiving tools enable archivists to meet regulatory requirements while complying with current standards. These solutions offer different approaches adapted to the needs of archiving professionals.
A genesis in regulatory archiving
Arcsys was created in the 1990s to meet specific tax requirements, in particular the control of accounting data. In 2004, the project took a decisive turn with a broader ambition: to provide a digital preservation solution covering both regulatory (invoices, contracts) and heritage (theses, nautical charts) needs. This dual purpose, combining legal compliance and data enhancement, marked the beginning of an adventure that continues to reinvent itself.
Constant challenges in a changing environment
Interoperability quickly became a major issue. Arcsys had to adapt to a wide range of technologies and standards, while limiting dependencies to guarantee its sustainability. For example, the software integrated various storage systems (disks, tapes, cloud) and sophisticated security tools, while maintaining its compatibility with external solutions such as SharePoint or SAP.
Technological developments have sometimes forced the team to abandon software components that had become obsolete, such as Solaris or TSM. These adjustments required constant trade-offs to remain aligned with customer needs and market standards.
The challenge of performance and volume
Faced with exponentially growing archive volumes, the software has evolved to support ever-increasing loads. Performance requirements, such as the management of tens of millions of documents per day or large-scale consultations, have led to a rethink of the product architecture. A modular approach, based on repositories, has made it possible to distribute the workload while guaranteeing fluidity and robustness. However, each added feature, such as compression or format validation, had to be carefully optimised so as not to compromise overall performance.
Security and durability: top priorities
Data security is a core concern of the product, particularly in sensitive sectors such as the military and aviation. Advanced electronic signature and sealing mechanisms guarantee the integrity of the archives. However, the growing threats to cybersecurity, illustrated by incidents such as Log4Shell, have necessitated the implementation of rigorous action plans. These protocols include continuous monitoring of vulnerabilities and rapid fixes to protect user data.
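The idea of sealing an archive so that any later alteration can be detected can be sketched as follows. This is an illustration only: the source does not describe Arcsys's actual mechanisms, and the keyed HMAC used here is a simple stand-in for the certificate-based signatures and timestamps that production sealing typically relies on. The function names and the key are hypothetical.

```python
# Sketch of integrity checking: a fixity digest plus a keyed seal.
# Assumption: HMAC-SHA256 stands in for a real electronic seal.
import hashlib
import hmac


def fixity_digest(payload: bytes) -> str:
    """SHA-256 digest recorded at ingest and recomputed at each audit."""
    return hashlib.sha256(payload).hexdigest()


def seal(payload: bytes, key: bytes) -> str:
    """Keyed seal: only a holder of the key can produce a matching value."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(payload: bytes, key: bytes, expected_seal: str) -> bool:
    """Recompute the seal and compare it in constant time."""
    return hmac.compare_digest(seal(payload, key), expected_seal)


if __name__ == "__main__":
    key = b"demo-key-not-for-production"  # hypothetical key
    document = b"invoice 2024-001"
    stored_seal = seal(document, key)
    print(verify(document, key, stored_seal))         # unchanged document
    print(verify(document + b"!", key, stored_seal))  # altered document
```

A plain digest detects accidental corruption, while the keyed seal also resists deliberate tampering by anyone who does not hold the key.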
Sustainability is also a fundamental pillar. The software strives to ensure a smooth migration of data through technological developments, while ensuring total transparency for customers. Data reversibility is another key aspect, allowing users to retain their autonomy, even if the software vendor were to fail.
Towards a sustainable and innovative future
While digital preservation continues to grow in importance, the software is moving towards developments that combine innovation and restraint. The aim is to encourage rational data management by eliminating unnecessary archives to limit the carbon footprint. The product’s software factory will also be strengthened to speed up updates and respond effectively to user expectations.
Innovation will remain cautious, with special attention paid to emerging technologies such as blockchain and artificial intelligence. These tools will only be integrated after their added value and sustainability have been guaranteed.
Necessary collaboration between public and private actors
In conclusion, there is a real need to establish closer collaboration between public, private and academic stakeholders to advance digital preservation. No system can function in isolation; it is essential to encourage the exchange and sharing of knowledge to meet the complex challenges in this field. Arcsys perfectly represents this dynamic, remaining an accessible solution that is constantly adapting to meet the challenges of tomorrow.
At Smals, a Belgian organisation specialising in IT services for the public sector, digital archiving has become a key solution for meeting the growing needs of public institutions. This presentation details the implementation of a shared platform, the challenges encountered, and the technical solutions chosen to manage a growing volume of digital data.
Background to the implementation of the platform
The project began with the National Social Security Office (NSSO), a central institution in Belgium responsible for managing electronic social security declarations. These files, generated in XML format, represent 100 million exchanges per year, with a growth of 10% each year.
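To make the effect of that growth rate concrete, the short sketch below compounds the annual volume. It assumes the 10% rate stays constant from year to year, which is a simplification; the function names are illustrative.

```python
# Sketch: compound growth of the annual number of exchanges.
# Assumption: a constant 10% yearly growth from a base of 100 million.


def projected_exchanges(base_per_year: int, growth_rate: float, years: int) -> int:
    """Annual volume after the given number of years of compound growth."""
    return round(base_per_year * (1 + growth_rate) ** years)


def cumulative_exchanges(base_per_year: int, growth_rate: float, years: int) -> int:
    """Total volume accumulated over the given number of years, starting now."""
    return sum(
        projected_exchanges(base_per_year, growth_rate, year)
        for year in range(years)
    )


if __name__ == "__main__":
    for year in (1, 5, 10):
        print(year, projected_exchanges(100_000_000, 0.10, year))
```

Under this assumption the annual volume reaches roughly 161 million exchanges after five years, which shows why storage planning has to account for the growth rate and not just the current volume.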
Initially, the aim was to reduce logistical costs while ensuring the legal security of declarations. However, the massive accumulation of data since 2003 has forced a rethink in favour of a more robust and centralised archiving solution.
Transition to a shared platform
In 2015, Smals began transforming the initial project, intended for a single institution, into a platform shared between several public institutions. This model is based on a single infrastructure and information silos to preserve data confidentiality.
- In 2017, the project went into production with the NSSO as its first client.
- Other institutions are gradually being integrated: 6 institutions are currently connected and 5 others are in the process of integration.
Today, the platform manages 480 million archived objects for a total of 81 TB of data, and is continuing to evolve.
Main features of the platform
- Evidential and secure archiving: digital documents, whether PDFs, videos or other formats, are archived according to rigorous standards.
- Multiclient and centralisation: a single shared but secure infrastructure is open to federal and regional public institutions.
- Online consultation: The archives are directly accessible via public portals or business applications. For example, Belgian citizens can consult their pension forecasts or administrative documents online.
This direct consultation functionality has enhanced the value of the archives, making them indispensable to business processes while reducing the costs associated with duplicating files in other systems.
Technical challenges encountered
- Metadata management:
  - The definition and use of metadata must balance their usefulness against the cost they incur.
  - An initially ambitious client simplified its model from 30 to 15 generic metadata fields to reduce complexity.
- Storage migration:
  - Regular replacement of storage media (every 5-6 years) involves cumbersome and costly migrations.
  - The last migration lasted 18 months, posing challenges in terms of planning and resources.
- Volume and performance:
  - Identifying formats with PRONOM caused significant slowdowns in production, especially for compressed or large files.
  - Functionality testing on large volumes remains complex and expensive.
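Format identification of the kind mentioned above generally works by matching byte signatures. The sketch below is a simplified illustration that reads only a bounded header so that the cost does not grow with file size. The signature table is a small hand-written sample, not taken from the PRONOM registry, and the function names are hypothetical; real PRONOM-based tools also handle signatures at other offsets and inspect the contents of container formats, which this sketch does not attempt.

```python
# Sketch: identify a file format from its leading bytes only.
# Assumption: SIGNATURES is an illustrative sample, not PRONOM data.
from pathlib import Path

SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"PK\x03\x04", "ZIP container"),
    (b"<?xml", "XML"),
]
HEADER_BYTES = 16  # enough to cover the longest signature above


def identify_bytes(header: bytes) -> str:
    """Return the label of the first signature that matches the header."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def identify_file(path: Path) -> str:
    """Read a bounded number of bytes so large files cost the same as small ones."""
    with path.open("rb") as handle:
        return identify_bytes(handle.read(HEADER_BYTES))


if __name__ == "__main__":
    print(identify_bytes(b"%PDF-1.7\n"))
    print(identify_bytes(b"plain text"))
```

A ZIP signature only reveals the container, not what is inside it, so identifying the contents of a compressed file means reading further into it.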
Impact and benefits
- Optimisation of costs: Because it reduces the need for redundant storage in business systems, the platform is recognised as an energy-saving solution.
- Support for business processes: Archives are no longer seen as a simple legal cost, but as a valued resource.
- Continuous evolution: The system is designed to adapt progressively to growing needs, while retaining the flexibility to integrate new institutions.
Conclusion of the presentation
The shared platform developed by Smals shows how an electronic archiving solution can meet the varied needs of public institutions while enhancing the archives for operational uses. Thanks to a modular and scalable approach, Smals has succeeded in reconciling legal requirements, technical constraints and economic ambitions, thus offering a sustainable and adaptable solution for the Belgian public sector.