AnyDorsal ID app
Pyseter ships with an experimental app that lets users click through proposed IDs, confirm matches, and export the results as a .csv. It’s worth reiterating that this is an experimental feature and, as such, is pretty bare bones. Nevertheless, we hope people will find it useful and suggest ways it could be improved.
This notebook demonstrates how to run the app locally and shows a demo version of the app. Running the code below launches a version of the app based on the Happywhale data, where the test images are the “query set” and the training images are the “reference set”. The demo version is hosted on Hugging Face and uses a subset of the Happywhale dataset.
Launching the app locally
Here we’ll assume that you’ve already extracted the features for the query images and the reference images. As such, we can just load them in.
import pandas as pd

from pyseter.experimental import launch_review
from pyseter.identify import predict_ids
from pyseter.sort import load_features

# Root folder of the Happywhale data (no trailing slash, so the joins below stay clean)
data_dir = '/Users/PattonP/datasets/happywhale'
feature_dir = data_dir + '/features'

# Features for the reference (training) images
reference_path = feature_dir + '/train_features.npy'
reference_files, reference_features = load_features(reference_path)

# Features for the query (test) images
query_path = feature_dir + '/test_features.npy'
query_files, query_features = load_features(query_path)

# Known IDs for the reference images
id_df = pd.read_csv(data_dir + '/train.csv')
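Once the features are loaded, it can be worth checking that the file names and feature vectors line up, because pairing them with zip silently drops any unmatched entries. The sketch below is one way to do that. The helper check_alignment is hypothetical (it is not part of Pyseter), and it assumes the features form a 2-D array with one row per file.

```python
import numpy as np


def check_alignment(files, features):
    """Hypothetical helper (not part of Pyseter): one feature row per file name."""
    features = np.asarray(features)
    if features.ndim != 2:
        raise ValueError(f"expected a 2-D feature array, got shape {features.shape}")
    if len(files) != features.shape[0]:
        raise ValueError(
            f"{len(files)} file names but {features.shape[0]} feature vectors"
        )
    if len(set(files)) != len(files):
        raise ValueError("duplicate file names would overwrite each other in a dict")
    return features.shape


# Stand-in data; in the notebook you would pass query_files and query_features.
demo_files = ["a.jpg", "b.jpg", "c.jpg"]
demo_features = np.zeros((3, 4))
shape = check_alignment(demo_files, demo_features)
```

In the notebook, the same check could be run once for the query set and once for the reference set before building the dictionaries.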
Then we’ll create dictionaries for the reference set and the query set. These dictionaries map the file names to the feature vectors. Once we’ve done that, we can predict the 5 closest reference IDs for each image in the query set.
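To make “closest IDs” concrete, here is a minimal sketch of the general idea: rank the reference individuals by how similar their feature vectors are to a query vector. This is an illustration only, not Pyseter's implementation. The function top_k_ids, the use of cosine similarity, and the toy data are all assumptions.

```python
import numpy as np


def top_k_ids(query_vec, reference_dict, id_lookup, k=5):
    """Illustrative only: rank IDs by their best cosine similarity to one query."""
    names = list(reference_dict)
    ref = np.stack([reference_dict[n] for n in names]).astype(float)
    ref /= np.linalg.norm(ref, axis=1, keepdims=True)  # unit-length rows
    q = np.asarray(query_vec, dtype=float)
    q = q / np.linalg.norm(q)
    sims = ref @ q  # cosine similarity of each reference image to the query

    # Keep the highest similarity seen for each individual.
    best = {}
    for name, sim in zip(names, sims):
        ident = id_lookup[name]
        if sim > best.get(ident, -np.inf):
            best[ident] = float(sim)

    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k]


# Toy data: two reference images of whale_A and one of whale_B (made-up values).
reference = {
    "r1.jpg": np.array([1.0, 0.0]),
    "r2.jpg": np.array([0.9, 0.1]),
    "r3.jpg": np.array([0.0, 1.0]),
}
ids = {"r1.jpg": "whale_A", "r2.jpg": "whale_A", "r3.jpg": "whale_B"}
proposals = top_k_ids(np.array([1.0, 0.0]), reference, ids, k=2)
```

Here whale_A ranks first because one of its reference images points in the same direction as the query. In the notebook itself, predict_ids does the real work.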
query_dict = dict(zip(query_files, query_features))
reference_dict = dict(zip(reference_files, reference_features))
prediction_df = predict_ids(reference_dict, query_dict, id_df, proposed_id_count=5)
Now we have all we need to launch the app: the data frame containing the predictions, prediction_df; the data frame containing the IDs for images in the reference set, id_df; the directory containing the query images; and the directory containing the reference images.
launch_review(
    prediction_df,
    id_df,
    data_dir + '/test_images',   # query images
    data_dir + '/train_images',  # reference images
)
Demo version
Here, we demonstrate what the app looks like when you run it locally. This demo version contains 68 images from the Happywhale dataset, sampled roughly evenly from every catalog in the dataset.
Here are a few pointers on how to use the app:
- Next and Prev navigate between query images
- You can navigate between reference images of proposed IDs with the left and right arrow keys
- The radio buttons under Select correct match allow you to select which proposed ID best matches the query image, if any
- Confirm match locks in your choice
- Download csv downloads a .csv where one column is the query image and the other column is the confirmed ID
- Clicking the X in the top right corner brings up a grid of images. Clicking an image on the grid returns to the single image viewer.
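After downloading the .csv, you may want a quick summary of the confirmed matches. The sketch below counts how many query images were assigned to each ID. The column names query_image and confirmed_id are assumptions for illustration (check the header of your own export), and the in-memory text stands in for the downloaded file.

```python
import csv
import io
from collections import Counter

# Stand-in for the downloaded file; the column names are assumptions.
exported = io.StringIO(
    "query_image,confirmed_id\n"
    "q1.jpg,whale_A\n"
    "q2.jpg,whale_B\n"
    "q3.jpg,whale_A\n"
)

rows = list(csv.DictReader(exported))

# Number of query images confirmed for each individual.
counts = Counter(row["confirmed_id"] for row in rows)
```

For a real export, the StringIO object would be replaced by the downloaded file opened with open(path, newline='').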