AnyDorsal ID app

Pyseter ships with an experimental app that allows users to click through proposed IDs, identify matches, and export the results of the identification via a .csv. It’s worth reiterating that this is an experimental feature and, as such, is pretty bare bones. Nevertheless, we hope people will find it useful and suggest ways it could be improved.

This notebook will demonstrate how to run the app locally, and show users a demo version of the app. Running the code below will launch a version of the app based on the Happywhale data, where the test images are the “query set” and the training images are the “reference set”.

You can also check out the demo version on Hugging Face. The demo version contains a subset of the Happywhale dataset.

Launching the app locally

Here we’ll assume that you’ve already extracted the features for the query images and the reference images. As such, we can just load them in.

from pyseter.experimental import launch_review
from pyseter.identify import predict_ids
from pyseter.sort import load_features
import pandas as pd

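# Root directory of the Happywhale data (no trailing slash, since the paths below add one)
data_dir = '/Users/PattonP/datasets/happywhale'

feature_dir = data_dir + '/features'

# File names and feature vectors for the reference (training) images
reference_path = feature_dir + '/train_features.npy'
reference_files, reference_features = load_features(reference_path)

# File names and feature vectors for the query (test) images
query_path = feature_dir + '/test_features.npy'
query_files, query_features = load_features(query_path)

# Known IDs for the images in the reference set
id_df = pd.read_csv(data_dir + '/train.csv')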
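
Before predicting IDs, it can help to confirm that the loaded pieces line up. The sketch below is not part of Pyseter: check_alignment is a hypothetical helper, and it assumes the ID table stores file names in a column called image (as the Happywhale train.csv is assumed to). It runs on toy data rather than the real features.

```python
import numpy as np
import pandas as pd

def check_alignment(files, features, id_table, image_column='image'):
    """Return the file names that have no row in the ID table (hypothetical helper)."""
    if len(files) != len(features):
        raise ValueError(f'{len(files)} file names but {len(features)} feature vectors')
    known = set(id_table[image_column])
    return [f for f in files if f not in known]

# Toy stand-ins for reference_files, reference_features, and id_df
toy_files = ['a.jpg', 'b.jpg', 'c.jpg']
toy_features = np.zeros((3, 4))
toy_id_table = pd.DataFrame({
    'image': ['a.jpg', 'b.jpg'],
    'individual_id': ['id_1', 'id_2'],
})

# c.jpg has features but no known ID
missing = check_alignment(toy_files, toy_features, toy_id_table)
```

With the real data, the equivalent call would be check_alignment(reference_files, reference_features, id_df), with image_column adjusted if the ID table uses a different column name.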

Then we’ll create dictionaries for the reference set and the query set. These dictionaries map the file names to the feature vectors. Once we’ve done that, we can predict, for each image in the query set, the 5 closest IDs in the reference set.

query_dict = dict(zip(query_files, query_features))
reference_dict = dict(zip(reference_files, reference_features))

prediction_df = predict_ids(reference_dict, query_dict, id_df, proposed_id_count=5)
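
To give a feel for what "closest IDs" can mean, here is a minimal sketch that ranks reference IDs by cosine similarity between feature vectors. It is an illustration only: top_k_ids is a hypothetical function, and predict_ids may use a different distance measure or a different way of combining several images of the same individual. The file names, IDs, and vectors are toy values.

```python
import numpy as np

def top_k_ids(reference_dict, query_dict, id_lookup, k=5):
    """For each query image, rank reference IDs by cosine similarity (illustrative only)."""
    ref_files = list(reference_dict)
    ref = np.stack([reference_dict[f] for f in ref_files]).astype(float)
    ref /= np.linalg.norm(ref, axis=1, keepdims=True)  # scale each row to unit length

    proposals = {}
    for query_file, vector in query_dict.items():
        q = np.asarray(vector, dtype=float)
        q = q / np.linalg.norm(q)
        similarity = ref @ q  # cosine similarity to every reference image

        # Keep the best similarity seen for each individual
        best = {}
        for f, s in zip(ref_files, similarity):
            individual = id_lookup[f]
            if s > best.get(individual, -np.inf):
                best[individual] = float(s)

        ranked = sorted(best, key=best.get, reverse=True)
        proposals[query_file] = ranked[:k]
    return proposals

# Toy data: two images of whale_A, one of whale_B
toy_reference = {
    'ref_a1.jpg': np.array([1.0, 0.0]),
    'ref_a2.jpg': np.array([0.9, 0.1]),
    'ref_b1.jpg': np.array([0.0, 1.0]),
}
toy_ids = {'ref_a1.jpg': 'whale_A', 'ref_a2.jpg': 'whale_A', 'ref_b1.jpg': 'whale_B'}
toy_query = {'query_1.jpg': np.array([0.8, 0.2])}

toy_proposals = top_k_ids(toy_reference, toy_query, toy_ids, k=2)
```

Here the query vector points mostly along the first axis, so whale_A is ranked ahead of whale_B.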

Now we have all we need to launch the app: the data frame containing the predictions, prediction_df; the data frame containing the IDs for images in the reference set, id_df; the directory containing the query images; and the directory containing the reference images.

launch_review(
    prediction_df,
    id_df,
    data_dir + '/test_images',   # directory of query images
    data_dir + '/train_images'   # directory of reference images
)
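
If the app cannot find an image, a quick check of the directories may help. The sketch below uses only the standard library; missing_images is a hypothetical helper, and it assumes the file names are relative to the image directory, which may not match how your features were extracted. It runs against a temporary directory rather than the real data.

```python
from pathlib import Path
import tempfile

def missing_images(image_dir, file_names):
    """Return the file names that are not present in image_dir (hypothetical helper)."""
    image_dir = Path(image_dir)
    if not image_dir.is_dir():
        raise FileNotFoundError(f'Not a directory: {image_dir}')
    return [name for name in file_names if not (image_dir / name).is_file()]

# Toy example: a temporary directory holding one of two expected images
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / 'query_1.jpg').touch()
    absent = missing_images(tmp, ['query_1.jpg', 'query_2.jpg'])
```

With the real data, the same check could be run on the query file names and the test image directory, and again on the reference file names and the training image directory.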

Demo version

Here, we demonstrate what the app looks like when you run it locally. This demo version contains 68 images from the Happywhale dataset, and represents a roughly even sample of every catalog in the dataset. You can also check out the demo version on Hugging Face.

Here are a few pointers on how to use the app:

Next and Prev navigate between query images
You can navigate between reference images of proposed IDs with the left and right arrow keys
The radio buttons under Select correct match allow you to select which proposed ID best matches the query image, if any
Confirm match locks in your choice
Download csv downloads a .csv where one column is the query image and the other column is the confirmed ID
Clicking the X in the top right corner brings up a grid of images. Clicking an image on the grid returns to the single image viewer.
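
Once downloaded, the .csv can be read back with pandas for further work. The sketch below uses a made-up stand-in for the file, and the column names query_image and confirmed_id are assumptions; check the header of your own download and adjust accordingly.

```python
import io
import pandas as pd

# Made-up stand-in for the downloaded file; the real column names may differ
example_export = io.StringIO(
    'query_image,confirmed_id\n'
    'query_1.jpg,whale_A\n'
    'query_2.jpg,whale_B\n'
    'query_3.jpg,whale_A\n'
)

confirmed = pd.read_csv(example_export)

# Count how many query images were matched to each confirmed ID
images_per_id = confirmed.groupby('confirmed_id')['query_image'].count()
```

For a real export, replace example_export with the path to the downloaded file.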