Hydropower eLibrary / Dam License and Document Repository

The Hydropower eLibrary is a free online repository for documents and data related to U.S. hydropower projects and infrastructure. It includes all hydropower documents from the FERC eLibrary, a comprehensive list of FERC hydropower projects, and an interactive map of existing U.S. hydro installations.

Live Site

`Problem` FERC Hydropower Licensing Documents are Difficult to Find

The Federal Energy Regulatory Commission (FERC)’s eLibrary is an online records information system containing millions of documents for the four industries FERC regulates: electric, hydropower, natural gas, and oil. These documents are particularly valuable to the hydropower community as they offer current and historical data on environmental studies and licensing conditions for individual hydropower projects.

However, users have reported frustration with finding and accessing relevant documents within the eLibrary due to its poor user interface and confusing data structure. The U.S. Department of Energy's Water Power Technologies Office (WPTO) tasked PNNL with creating a hydropower-focused version of the FERC eLibrary. This new version aims to provide hydropower industry professionals with a better user experience in locating FERC documents and additional hydropower data not included in the original FERC eLibrary.

`User Research & Context` The Original FERC eLibrary

First, I interviewed and observed hydropower professionals using the FERC eLibrary and noted several key issues common to most users:

Project ID Barrier: Finding a hydropower project's P-0000 ID number is a prerequisite for any search related to that project. There is no list of these IDs, creating a significant barrier for users before they can even begin a document search.
Inefficient Search Results: A general search yields too many results, while a more targeted search excludes too much.
Irrelevant Industry Data: The categories and filters are designed for compatibility with all FERC-regulated industries, but our users are only interested in hydropower data (not transmission, oil, or natural gas).
Confusing Layout: Results are displayed in a table with a confusing layout. Users often do not understand why a result is returned, how sorting works, or how to navigate to view a result.
Slow File Download Requirement: Reviewing the actual contents of a document requires downloading the original file. These files can be quite large, resulting in slow download times.
Irrelevant Document Types: Users typically search for a small number of document types associated with the licensing process, such as License Issuances, License Amendments, and Environmental Reports. Most other records are obstacles to finding these "Key Documents."
Poor Data Quality: Data quality on the FERC eLibrary is generally poor, with frequent mis-categorizations, inconsistencies in naming conventions, and typos that make keyword searches extremely difficult.

FERC eLibrary Screenshots

The FERC eLibrary Search Page: The **Search Page** didn't have too many issues. The most notable problem was finding the correct "Docket Number" field value to begin a search.

The FERC eLibrary Results Table Page: The **Results Page** table was difficult to visually parse, and content was hard to navigate to. Almost every user I observed used Ctrl+F to highlight their search term.

`Iteration 0` Initial Designs, Layout, and Prototyping

Low Fi Mockups

I began my design exploration with a large set of low-fidelity mockups to compare different layout options. The top five concepts are shown here. From these, I selected two to advance to high-fidelity prototypes: "2 Column Alt" as the primary option, and "1-2 Column Hybrid" as a backup.

Several sets of low-fidelity mockup options

2 Column Layout - High Fi Prototypes

Next, I created high-fidelity interactive prototypes to develop my preferred "2 Column Alt" layout.

Iteration 1 mockups showing Docket selection

Iteration 1 mockups showing the Document search page with results and graphs

Iteration 1 mockups showing a map of hydropower dams in the UI

Iteration 1 mockups showing the Document search page as a table

Iteration 1 mockups showing a Document rendered as a pdf

Iteration 1 mockups showing a Document rendered as raw text

Single Page Scrolling Alternate

I also explored the "1-2 Column Hybrid" layout as a backup. Although I liked this simpler concept, discussions with users and engineers led to the decision to go with the more complex layout.

Iteration 1 mockups showing an alternate layout of a simplified single scroll search page

Switching Themes to Avoid Mis-Crediting Our Client

Initially, I adopted the style of the FERC.gov website by theming Blueprint.js using my Blueprint Styler library. However, after the first round of feedback, the client (WPTO) requested a style change to avoid any brand association that might imply credit to FERC.

`Iteration 1` Proof of Concept

Getting the Data from the FERC eLibrary

As the product owner, I worked closely with our data scientists and engineers to plan the data ingestion pipeline. We discovered that we could use the same API that the FERC eLibrary UI search uses to pull data. We considered two approaches:

`Data Approach 1` Use the FERC eLibrary as Our Database

With this minimal approach, our UI would search the FERC eLibrary via their search API but provide a better search experience by translating certain elements like tags into advanced FERC filter sets on a proxy server.

A graphic of on-demand data retrieval from the FERC eLibrary

Pros: No need to host our own data. Dataset is always up-to-date.
Cons: Requires setting up a server proxy to transform data into a different structure. Dependence on another site for everything; if they block us or go offline, we are affected. No way to display a document's raw text.
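As a rough illustration of the tag translation described for this approach, a proxy could keep a lookup from friendly tags to upstream filter clauses. This is only a sketch: the tag names, the `field`/`value` shape, and `buildUpstreamQuery` are all hypothetical, and the real FERC search API parameters are not described in this write-up.

```typescript
// One filter clause in the upstream search request (hypothetical shape).
interface UpstreamFilter {
  field: string;
  value: string;
}

// Hypothetical mapping from a friendly UI tag to the filter clauses it stands for.
const TAG_FILTERS: Record<string, UpstreamFilter[]> = {
  "license-issuance": [
    { field: "category", value: "issuance" },
    { field: "description", value: "license" },
  ],
  "environmental-report": [
    { field: "description", value: "environmental" },
  ],
};

// Look up the filter clauses for a tag. Unknown tags yield no filters.
function filtersForTag(tag: string): UpstreamFilter[] {
  // hasOwnProperty guards against inherited keys such as "constructor".
  return Object.prototype.hasOwnProperty.call(TAG_FILTERS, tag)
    ? TAG_FILTERS[tag]
    : [];
}

// Combine the user's keyword with the expanded tag into one upstream request body.
function buildUpstreamQuery(
  keyword: string,
  tag: string
): { searchText: string; filters: UpstreamFilter[] } {
  return { searchText: keyword, filters: filtersForTag(tag) };
}
```

The user only ever sees the tag; the proxy owns the knowledge of which upstream filters it expands to.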

`Data Approach 2` Copy and Host Our Own Data

With this more robust and traditional approach, we'd pull, enrich, and save a copy of all relevant data from the FERC eLibrary (except for original files). We opted for this approach to have more control.

A graphic showing the data ingestion and enrichment process

Pros: Ability to enrich ingested data to improve data quality, including parsing raw text from documents. Ability to manage how data is queried. Independence from FERC eLibrary's servers and API.
Cons: Initial need to pull a large volume of data (1.4 million documents). Daily updates required, so data will be one day behind. Costs associated with hosting a large amount of data.

Hydropower Projects Dataset

User interviews revealed the need for a list of FERC Hydropower Projects to accompany and cross-link with the documents dataset. We identified a set of regularly updated Excel sheets published on the Hydropower Licensing Page of the FERC website. I then collaborated with our data engineers to create an ingestion script that pulls these Excel sheets weekly, merges them into a unified schema, and publishes a new Projects dataset.
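The merge step could be sketched roughly as follows, assuming each sheet has already been read into plain row objects. The column names, the `Project` fields, and the example IDs are hypothetical; the actual sheet layouts and unified schema are not given here.

```typescript
// Unified project record (hypothetical fields).
interface Project {
  projectId: string;
  name?: string;
  state?: string;
  licenseStatus?: string;
}

// A row as read from one sheet: column header -> cell text.
type SheetRow = Record<string, string | undefined>;

// Hypothetical per-sheet column headers mapped onto the unified fields.
const COLUMN_ALIASES: Record<string, keyof Project> = {
  "Project No.": "projectId",
  "Docket": "projectId",
  "Project Name": "name",
  "ST": "state",
  "State": "state",
  "Status": "licenseStatus",
};

// Merge rows from several sheets into one record per project ID.
// Later sheets fill in or overwrite fields from earlier ones.
function mergeSheets(sheets: SheetRow[][]): Project[] {
  const byId: Record<string, Project> = {};
  for (const rows of sheets) {
    for (const row of rows) {
      const partial: Partial<Project> = {};
      for (const column of Object.keys(row)) {
        const field = COLUMN_ALIASES[column];
        const value = row[column];
        if (field && value) partial[field] = value.trim();
      }
      const id = partial.projectId;
      if (!id) continue; // a row without a project ID cannot be cross-linked
      byId[id] = { ...byId[id], ...partial, projectId: id };
    }
  }
  return Object.keys(byId).map((id) => byId[id]);
}
```

Keying everything on the project ID is what allows the Projects dataset to cross-link with documents later.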

Parsing Raw Text

One of the primary user requests was the ability to quickly preview document contents without having to download them. Users wanted to scan a snippet of text to decide if the document was worth a deeper look.
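A snippet of this kind can be produced with very little code. The sketch below is illustrative only; `makeSnippet` and its 200-character default are assumptions, not the app's actual values.

```typescript
// Build a short preview from a document's parsed raw text.
function makeSnippet(rawText: string, maxLength = 200): string {
  // Parsed PDF text tends to be full of hard line breaks and runs of spaces.
  const collapsed = rawText.replace(/\s+/g, " ").trim();
  if (collapsed.length <= maxLength) return collapsed;

  // Cut at the limit, then back up to the last space so the preview
  // never ends in the middle of a word.
  const cut = collapsed.slice(0, maxLength);
  const lastSpace = cut.lastIndexOf(" ");
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + "...";
}
```

For example, `makeSnippet(doc.rawText)` could feed a results list, with the full text reserved for the document details page.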

Parsing raw text also allowed us to perform better keyword searches, although the extreme volume of text later proved to be a performance issue.

I collaborated with our engineers to devise an approach to parse as much raw text as possible. Despite our efforts, data gaps remained. Some files were image scans requiring OCR, and some raw text was formatted irregularly. We decided that achieving around 80% coverage was acceptable and moved on.

A preview of the raw text parsed out of a PDF document

Example: Hydropower eLibrary | PUBLIC103012P1904TransmittalLtrNOIPAD.PDF | FERC eLibrary Document: Accession 20121031-5296 (pnnl.gov)

Functional React Prototype

The first iteration concluded with a functional application that successfully displayed all the collected data.

A screenshot of the Iteration 1 Prototype Document Search Page

A screenshot of the Iteration 1 Prototype Document Details Page

A screenshot of the Iteration 1 Prototype Project Search Page

A screenshot of the Iteration 1 Prototype Project Details Page

`Iteration 2` Private Alpha

Project Search Features & Key Document Tagging

In iteration 2, we introduced automatic document tagging for the most important "Key Documents" and a project search feature that allows users to search for projects by name, organization, or waterway, rather than just the P-0000 project ID number. We also experimented with a simplified "minimal input" search homepage, but it tested poorly with users and was removed in iteration 3.
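One simple way to auto-tag Key Documents is a list of pattern rules applied to each document's description during ingestion. The sketch below assumes that approach; the rules, tag names, and `tagKeyDocument` are hypothetical and not the app's actual logic.

```typescript
// A tagging rule: if the pattern matches the description, apply the tag.
interface TagRule {
  tag: string;
  pattern: RegExp;
}

// Hypothetical rules. Patterns are case-insensitive and tolerate extra
// whitespace, since descriptions are inconsistently formatted.
const KEY_DOCUMENT_RULES: TagRule[] = [
  { tag: "License Issuance", pattern: /order\s+issuing\s+(new\s+|original\s+)?license/i },
  { tag: "License Amendment", pattern: /license\s+amendment|amend(ing|ment)\s+(of\s+)?license/i },
  { tag: "Environmental Report", pattern: /environmental\s+(assessment|impact\s+statement|report)/i },
];

// Return every Key Document tag whose rule matches the description.
function tagKeyDocument(description: string): string[] {
  const tags: string[] = [];
  for (const rule of KEY_DOCUMENT_RULES) {
    if (rule.pattern.test(description)) tags.push(rule.tag);
  }
  return tags;
}
```

Documents that match no rule simply stay untagged, which keeps them out of the "Key Documents" filter.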

Additionally, I adjusted the app's theme in iteration 2. However, most users didn't like the vibrant orange accent color, leading me to adopt a more neutral style in the next iteration.

A screenshot of the Iteration 2 Homepage

A screenshot of the Iteration 2 Homepage with Project Search

A screenshot of the Iteration 2 Document Search Page

A screenshot of the Iteration 2 Document Details Page

A screenshot of the Iteration 2 Project Search Page

A screenshot of the Iteration 2 Project Details Page

`User Testing` Cutting Features: User Personalization and Saved Filter Features

To inform iteration 2, I conducted observational user testing of the first iteration. Users initially requested the ability to save filter sets and tag specific documents. Implementing these personalization features would require user account management and saving user data. Additionally, the functionality for saving, editing, and overwriting filters and tags turned out to be quite complex.

A screenshot of the Iteration 2 Saved Filters Panel — The **Saved Filters** feature provided a list where complex filters could be saved and returned to later. Numeric highlights indicate when there is new content that matches the saved filter.

A screenshot of the Iteration 2 Collections Tags Panel — The **Collections** feature allowed a user to add custom tags to a document or project to easily find it later. Collections tags could be organized hierarchically.

In iteration 2, we prototyped and user-tested these personalization features. Surprisingly, the filter sets users wanted to save were extremely simple, sometimes just a single filter. When I asked about this, users referenced their difficult experience with entering filters in the original FERC eLibrary tool. Because our search features were much easier to use, entering a search took less time and effort than configuring and using a saved filter.

Regarding the document tagging feature, users mostly tagged the same Key Document types that we were already auto-tagging during data ingestion. Again, our enhanced data UX made this feature largely unnecessary.

Ultimately, we decided to cut these personalization features. The marginal benefits did not justify the overhead of managing user accounts.

`Iteration 3` Public Beta

Usability and Discoverability Testing

After releasing the public beta, I conducted a more quantitative observational user test with 5 participants from the hydropower industry. Each participant was asked to complete a common set of tasks designed to test each major feature of the app. Each feature was graded on a 4-point scale to help prioritize which features and tasks needed more attention.

Test Grading

🟢 Pass Fast - the task was completed in one or a few attempts
🟨 Pass Slow - the task was completed without a hint after several unsuccessful attempts
🔶 Partial Fail - the user became stuck and required a hint to complete the task
🔻 Fail - the user could not complete the task even after one or more hints were provided
🕙 Skip - task was not attempted due to time constraints or other restrictions

Task	Feature Tested	Grade
Login/Homepage - Where are you? What can you do? Who made this?	Does the homepage accurately describe the tool and data sources? Would a user understand them?	---
There is a project named “Felt” on the Teton River. Find the most recent “License Application” for that project.	Find a Document	🟢1 🟨2 🔶2 🔻0 🕙0
(2 Part) Please save this document to easily find it later. Then, could you save this in another way?	'Favorite' document and/or 'Download' document
- Download	Used first 4 times	🟢4 🟨1 🔶0 🔻0 🕙0
- Favorite	Used first 1 time	🟢3 🟨0 🔶1 🔻0 🕙1
New Search	Clear out the current search and start from scratch	---
Find all the Projects in “Montana” “Utah” and “Idaho” - how many are there?	Searching in Projects Dataset, Finding and using "State" filter	🟢2 🟨1 🔶2 🔻0 🕙0
Please find all the projects in the current list that have an "Active License"	Applying more filters, specifically the (License) "Status" Filter	🟢3 🟨1 🔶1 🔻0 🕙0
Please find "Active Licenses" in "Idaho" only	Remove filters	🟢5 🟨0 🔶0 🔻0 🕙0
Find the project with the earliest future expiration date. The next active license to expire (Felt)	Sorting by property, reversing sort	🟢3 🟨1 🔶1 🔻0 🕙0
Find the License Issuance for that project (Felt)	'Search this Project for Documents' button, Document search (again).	🟢3 🟨1 🔶1 🔻0 🕙0
New Search	Clear out the current search and start from scratch	---
Find all NEPA Documents	'Key Documents' filter and data tagging	🟢1 🟨2 🔶2 🔻0 🕙0
Search for all NEPA documents that mention “Salmon”	Keyword search and understanding of why results were returned, dataset trust	🟢1 🟨3 🔶1 🔻0 🕙0
Search for all NEPA documents that mention “Salmon” or "Trout"	Advanced keyword search	🟢0 🟨0 🔶0 🔻4 🕙1
Totals	11 Tasks / 5 Participants	🟢26 🟨12 🔶11 🔻4 🕙2
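The totals row can be re-derived from the per-task counts, and the same tallies can be used to rank features by how often participants struggled. The sketch below copies the counts from the table; the feature labels are shortened, and the ranking metric (share of attempted runs that needed a hint or failed) is an assumption for illustration, not necessarily how priorities were actually set.

```typescript
// Counts per task: [passFast, passSlow, partialFail, fail, skip].
type Tally = [number, number, number, number, number];

interface TaskResult {
  feature: string;
  tally: Tally;
}

const RESULTS: TaskResult[] = [
  { feature: "Find a Document", tally: [1, 2, 2, 0, 0] },
  { feature: "Download", tally: [4, 1, 0, 0, 0] },
  { feature: "Favorite", tally: [3, 0, 1, 0, 1] },
  { feature: "State filter", tally: [2, 1, 2, 0, 0] },
  { feature: "Status filter", tally: [3, 1, 1, 0, 0] },
  { feature: "Remove filters", tally: [5, 0, 0, 0, 0] },
  { feature: "Sorting", tally: [3, 1, 1, 0, 0] },
  { feature: "Search this Project for Documents", tally: [3, 1, 1, 0, 0] },
  { feature: "Key Documents filter", tally: [1, 2, 2, 0, 0] },
  { feature: "Keyword search", tally: [1, 3, 1, 0, 0] },
  { feature: "Advanced keyword search", tally: [0, 0, 0, 4, 1] },
];

// Column-wise sum of all task tallies.
function totals(rows: TaskResult[]): Tally {
  return rows.reduce<Tally>(
    (sum, row) => [
      sum[0] + row.tally[0],
      sum[1] + row.tally[1],
      sum[2] + row.tally[2],
      sum[3] + row.tally[3],
      sum[4] + row.tally[4],
    ],
    [0, 0, 0, 0, 0]
  );
}

// Share of attempted runs (skips excluded) that needed a hint or failed.
function struggleRate(tally: Tally): number {
  const attempted = tally[0] + tally[1] + tally[2] + tally[3];
  return attempted === 0 ? 0 : (tally[2] + tally[3]) / attempted;
}

// Features ordered from most to least struggled-with.
function rankByStruggle(rows: TaskResult[]): TaskResult[] {
  return rows.slice().sort((a, b) => struggleRate(b.tally) - struggleRate(a.tally));
}
```

`totals(RESULTS)` reproduces the totals row (26, 12, 11, 4, 2), which sums to 55, or 11 tasks times 5 participants. Under the assumed metric, advanced keyword search ranks first, since every participant who attempted it failed.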

Interactive Dam Map of Existing Hydropower Assets in the United States.

One of the final and most requested features was an interactive dam map. We sourced data from Oak Ridge National Laboratory's (ORNL) Existing Hydropower Assets (EHA) Database. This data source enabled us to plot the positions of dams nationwide and provided additional metadata for our Projects dataset.

A screenshot of the Iteration 3 Dam Map Page

Dedicated Project Page

We also added a dedicated Project page that includes all the metadata, dams, and Key Documents tagged to that project.

A screenshot of an Iteration 3 Project Details Page

Additional Tweaks and Refinements

Based on user feedback, we implemented numerous refinements, including a new homepage, an about page, maps integrated into most pages, improved auto-tagging, additional Projects, CSV export, and enhanced cross-linking between Document and Project pages, among many other improvements.

A screenshot of the Iteration 3 Homepage

A screenshot of the Iteration 3 Document Search Page

A screenshot of the Iteration 3 Project Search Page

A screenshot of the Iteration 3 Document Details Page

`Conclusion` A Generally Successful Launch

Overall, this project was a reasonable success. It's currently deployed and receives a steady average of 150 monthly users, with an average session duration of 8 minutes and about 3,000 total monthly page views. Feedback from direct contact has been positive, with most requests asking for more data rather than additional features. Although I was underwhelmed by the traffic numbers, WPTO was pleased with the tool and its traffic, indicating that the app is extremely helpful to a small but dedicated user base.

You can access the app now at:

HydropowerELibrary.pnnl.gov

`Retrospective` What Could Have Been Better

Static Site Generation

The current site is a React SPA that (slowly) loads a large React bundle and then requests all the data it needs. No content is rendered for several seconds, and Google's crawling capabilities are limited due to the JavaScript resources required to render each page.

In hindsight, using Server-Side Rendering (SSR) or Static Site Generation (SSG) with a framework like Next.js would have been better. Sending pre-rendered HTML would improve page loading speed and enhance our chances of Google indexing our project and key document pages, potentially improving organic search traffic for searches like "Vernon Hydropower License."

Keyword Search Coverage and Relevance Sorting

Our keyword search turned out to be too slow to search the raw text of all our documents. We had to limit the search scope to only include the full raw text of Key Documents, while other documents only had the first few thousand characters available to keyword search. This limitation was a significant hit to our UX since keyword search is the most common method of filtering.
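The effect of this scoping rule is easy to see in a small sketch. The names below are hypothetical, and the 3,000-character limit is an assumed stand-in for "the first few thousand characters."

```typescript
interface IndexedDocument {
  isKeyDocument: boolean;
  rawText: string;
}

// Assumed cutoff; the write-up only says "the first few thousand characters".
const NON_KEY_CHAR_LIMIT = 3000;

// Key Documents are searchable in full; everything else only by its opening text.
function searchableText(doc: IndexedDocument): string {
  return doc.isKeyDocument ? doc.rawText : doc.rawText.slice(0, NON_KEY_CHAR_LIMIT);
}

// Case-insensitive substring match against the searchable portion only.
function matchesKeyword(doc: IndexedDocument, keyword: string): boolean {
  return searchableText(doc).toLowerCase().indexOf(keyword.toLowerCase()) !== -1;
}
```

A keyword that first appears on a later page of a non-key document is simply never found, which is the coverage gap described above.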

We also attempted to sort results based on "Relevance," which is a pretty fuzzy criterion. We developed rudimentary relevance scoring based on keyword frequency and location, but it did not perform particularly well.
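A frequency-and-location score of this kind might look like the sketch below. It assumes "location" means which field the keyword appears in (title versus body); the weight and all names are hypothetical, not the app's actual scoring.

```typescript
// Count non-overlapping, case-insensitive occurrences of keyword in text.
function countOccurrences(text: string, keyword: string): number {
  if (keyword.length === 0) return 0;
  const haystack = text.toLowerCase();
  const needle = keyword.toLowerCase();
  let count = 0;
  let index = haystack.indexOf(needle);
  while (index !== -1) {
    count += 1;
    index = haystack.indexOf(needle, index + needle.length);
  }
  return count;
}

interface ScorableDocument {
  title: string;
  body: string;
}

// Hypothetical weight: a title hit counts as much as five body hits.
const TITLE_WEIGHT = 5;

// Higher score means "more relevant" under this crude model.
function relevanceScore(doc: ScorableDocument, keyword: string): number {
  return (
    TITLE_WEIGHT * countOccurrences(doc.title, keyword) +
    countOccurrences(doc.body, keyword)
  );
}
```

A raw count like this favors long documents that repeat a term over short ones that are actually about it, which is one plausible reason such scoring underperforms.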

(Database and server architecture is a bit beyond my depth, but...) Utilizing other databases or search engines like Elasticsearch could vastly improve search coverage, speed, and sorting. However, supporting this along with Static Site Generation would require a significant rewrite of our current infrastructure.

URL Search Parameters for Filter Sets

Our filters are not saved to or restored from URL search parameters. Adding this feature would have allowed users to keep their "Saved Filters" as a URL without requiring us to support user accounts.
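A minimal sketch of such a round trip, using the standard `URLSearchParams` API. The `ProjectFilters` shape and the parameter names (`state`, `status`, `q`) are hypothetical.

```typescript
// Hypothetical filter state for the Project search page.
interface ProjectFilters {
  states: string[];
  status?: string;
  keyword?: string;
}

// Serialize filters into a query string that can be bookmarked or shared.
function filtersToQuery(filters: ProjectFilters): string {
  const params = new URLSearchParams();
  for (const state of filters.states) params.append("state", state);
  if (filters.status) params.set("status", filters.status);
  if (filters.keyword) params.set("q", filters.keyword);
  return params.toString();
}

// Restore filters from a query string (with or without a leading "?").
function queryToFilters(query: string): ProjectFilters {
  const params = new URLSearchParams(query);
  const filters: ProjectFilters = { states: params.getAll("state") };
  const status = params.get("status");
  const keyword = params.get("q");
  if (status) filters.status = status;
  if (keyword) filters.keyword = keyword;
  return filters;
}
```

For instance, `filtersToQuery({ states: ["Idaho", "Utah"], status: "Active License" })` yields `state=Idaho&state=Utah&status=Active+License`, and passing that string to `queryToFilters` restores the same filters. A bookmark of the resulting URL then acts as a saved filter with no account required.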

Mobile Design

The initial scope of this project deprioritized a responsive mobile layout to cut development time. As the project progressed, our frontend architecture proved prohibitively difficult to retrofit with a mobile layer. If mobile design had been considered from the beginning, it would not have been difficult to maintain long-term.

Thanks for reading. Return Home to continue.

James Bradford / UX Engineer
