Library - University of Liverpool

Researcher KnowHow: Web-based Research Tools

On this page, we list some of the web-based tools available for researchers. 

Unless otherwise indicated, if you need or would like further support with an application, you should contact the online community and/or company concerned. The University does not have an institutional subscription to all of these tools. If you wish to purchase a subscription for your research project, please check with IT Services first.

Digital Scholar Lab

The Gale Digital Scholar Lab is a cloud-based research environment that allows students and researchers to apply natural language processing tools to raw text data (OCR) from Gale's Primary Sources in a single research platform.

What Gale says

When performing analyses, finding, cleaning, and organising data, natural language processing (NLP) for historical texts is often a daunting task, especially when looking to generate meaningful results. Gale Digital Scholar Lab removes these barriers and streamlines the workflow process, allowing researchers to spend more time identifying previously undiscovered data, testing theories, analysing results, and gaining new insights.


You can access the Digital Scholar Lab via the Library’s list of databases, which can be found on the Library Main Page under the Discover Search Bar. Choose Research Tools in the Database type section; the tool is listed alphabetically. You will have to sign in to get access to the tool, and you will need a Google account to create an account with Gale.

Data Analysis

The web is a valuable tool to help with understanding your data analysis problems, whether you are a novice or an expert.

For those just starting or with some basic skills, there are the following.

LinkedIn Learning

The University has a licence for LinkedIn Learning, which provides video-based training on a wide range of topics. Under the topic of data analysis, there are 195 courses listed. These range from relatively short introductions to the topic, such as Learning Data Analytics, to multi-session courses with 30+ hours of content. Some are self-contained, whilst others are best appreciated if you install the suggested application software and do exercises as you progress through the course.

The Programming Historian

A popular text-based interactive site is The Programming Historian. As of August 2021, there are 86 lessons in English as well as a good range of courses in French, Spanish and Portuguese. These vary in difficulty with a range of topics covered. Several of these address data analysis needs relevant to the Humanities.


OpenRefine is an Open Source tool that helps you explore and clean data and transform it from one format to another. It also allows your data to be augmented with data from other sources such as websites and web services.

OpenRefine is a Java-based tool that runs via a small server on your local computer. This ensures privacy, because your data never leaves your system.

Library Carpentry provides a useful introductory lesson, which includes sample data, in OpenRefine.


Amnesia is a tool to transform personal data to anonymous data, which can then be used free of restrictions imposed by GDPR. It facilitates the replacement of unique values, or combinations of values, with more abstract ones. These abstractions can be saved and reapplied on similar data in the future. Amnesia can produce several solutions to fully anonymise a dataset, allowing the user to choose the most appropriate to the potential research application.
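The idea of replacing specific values with more abstract ones can be illustrated with a minimal Python sketch. This is not Amnesia's own code or interface: the sample records, the ten-year age bands, the postcode rule and all function names are assumptions made purely for illustration. The sketch generalises each record and then checks the size of the smallest group of identical records, a measure often discussed under the name k-anonymity.

```python
from collections import Counter

# Hypothetical sample records (illustrative values only, not real data).
records = [
    {"age": 23, "postcode": "L69 3BX"},
    {"age": 27, "postcode": "L69 7ZX"},
    {"age": 34, "postcode": "L7 8TX"},
    {"age": 38, "postcode": "L7 3FA"},
]


def generalise_age(age, width=10):
    """Replace an exact age with a band such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalise_postcode(postcode):
    """Keep only the first part of the postcode (the part before the space)."""
    return postcode.split()[0]


def generalise(record):
    """Return a copy of the record with both fields made more abstract."""
    return {
        "age": generalise_age(record["age"]),
        "postcode": generalise_postcode(record["postcode"]),
    }


def smallest_group(rows):
    """Size of the smallest set of rows that share identical values."""
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return min(counts.values())


anonymised = [generalise(r) for r in records]

print(smallest_group(records))     # every original record is unique
print(smallest_group(anonymised))  # each generalised record has a twin
```

In this toy example every original record is unique, whereas after generalisation each record is indistinguishable from at least one other. Whether a given level of generalisation is sufficient for a real dataset is a judgement the sketch does not attempt to make.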

Amnesia is Open Source and can be used under a 3-Clause BSD License.


SketchEngine is an online language analysis tool. It allows users to study large collections of language use, known as corpora, mainly collected from different websites, to reveal insights into language, information, content, and communication. The corpora on SketchEngine each contain several billion words of English, Romance languages, Arabic, Chinese, and so on. Data is tagged with information such as part of speech (noun, verb, etc.) and syntactic position (subject, object, modifier, etc.). Searches can reveal patterns and meanings of language employed by many speakers and writers, or can contest or elucidate received wisdom about language use. SketchEngine users can also upload their own corpora for analysis.
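To give a sense of what searching tagged data means, here is a small Python sketch. It does not use SketchEngine or its query language: the nine-word sample, the tag names and the function name are all assumptions chosen for illustration. Each word is paired with a part-of-speech tag, and the search looks for every place where a given sequence of tags occurs.

```python
# Hypothetical tagged sample: each token is a (word, part-of-speech) pair.
corpus = [
    ("the", "DET"), ("old", "ADJ"), ("library", "NOUN"),
    ("holds", "VERB"), ("rare", "ADJ"), ("books", "NOUN"),
    ("and", "CONJ"), ("old", "ADJ"), ("maps", "NOUN"),
]


def find_pattern(tokens, tags):
    """Return every run of words whose tags match the given tag sequence."""
    n = len(tags)
    matches = []
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        if all(tag == wanted for (_, tag), wanted in zip(window, tags)):
            matches.append(" ".join(word for word, _ in window))
    return matches


# Find every adjective that is immediately followed by a noun.
adj_noun = find_pattern(corpus, ["ADJ", "NOUN"])
print(adj_noun)
```

On a corpus of billions of words, the same kind of query can show which adjectives typically accompany a noun, which is one way such searches reveal patterns of usage.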

Access SketchEngine through the institutional login, using your University credentials and two-step identification.


Omeka is a web publishing platform for sharing digital collections and creating online exhibitions. It’s widely used in the cultural heritage sector, but has wider applications for a range of humanities and arts research data, especially if you are interested in showcasing collections and sharing data with the public. The free, web-hosted version allows you to create a single site where you can upload images, audio, video, etc. and apply rich, descriptive metadata using Dublin Core. If you need more sites and more storage, a range of pricing tiers is available. The open source, downloadable version of Omeka is even more powerful and versatile, although it may require some institutional IT support to run and maintain.

An example of using Omeka to share research data (including crowd-sourced data): Hermoupolis Digital Heritage Management (Hermes).
