SF Nexus is an open access project. Data and resources provided here are free for everyone, including:

Extracted Features: Disaggregated feature sets from copyrighted literature, available for research purposes
Python Notebooks: Custom Jupyter notebooks in Google Colab environments for easy exploration of our data
Documentation: Descriptions of pipelines used to digitize and analyze our dataset, from OCR cleaning to topic modeling and visualization
Visualizations: Output generated from analyses of our dataset, including topic modeling and word embeddings

Overview of the SF Nexus

The SF Nexus is a collaborative network of research and public libraries with collections of SF, dedicated to making science fiction available online, including as data. While the SF Nexus project is based at Temple University’s Charles Library, we are committed to growing our collaborations with an SF-focused collective research community. This project presents a prototype of what could be developed into a large-scale collaborative digitization effort among the dozens of science fiction collections across England and North America, including but not limited to the members of the Science Fiction Collecting Libraries Consortium.

In its current phase, this website presents a demonstration project showing how libraries can digitize their copyrighted cultural collections and make them available as data. Our focus so far has been on sharing extracted features of the data and on documenting the corpus’s ingestion and curation in the HathiTrust Research Center. Additional projects at Temple Libraries involve developing localized data capsules for confidential computing access to copyrighted corpora, as well as novel ways of digitizing corpora under controlled circumstances.

Explore the Project

About — the project’s history, the Paskow Science Fiction Collection, and how the corpus was digitized and ingested into HathiTrust
Data — freely available extracted-features datasets drawn from our 403-text corpus
OCR and Models — our digitization pipeline and topic-modeling analyses
Scholarship — related projects, datasets, and digital archives of science fiction as data

Ultimately, the SF Nexus seeks to build and share a comprehensive dataset of science fiction literature. Because of the limitations imposed by copyright, this project explores speculative approaches to data curation that can make elements of each book (extracted features) available to scholars seeking to engage in large-scale analysis of text as data.

For an overview of our approach, see Alex Wermer-Colan and James Kopaczewski’s article, “The New Wave of Digital Collections: Speculating on the Future of Library Curation” (2022).

The SF Nexus

A hub for science fiction collections as data
