Duplicate Clustering

This page describes how to view and batch-link duplicate specimens (specimens of the same taxon collected on the same day by the same person in the same place) using the Duplicate Clustering tool.

Occurrences can be linked as duplicates individually during or after data entry using tools in the occurrence editor. See this page for more information about linking duplicates on an individual basis and this page for information about using the duplicate matching tool during data entry.

Occurrences can also be batch-linked automatically by the Duplicate Clustering tool. This tool creates a temporary index of your occurrences' collection dates, collector numbers, and collector last names, then links any occurrences that share all three of these characteristics.
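The matching rule described above can be illustrated with a short Python sketch. This is not Symbiota's own code, only a minimal illustration under stated assumptions: the record keys (occid, eventDate, recordNumber, recordedBy) are hypothetical names chosen for the example, and the last-name extraction is a naive guess that takes the final word of the collector string.

```python
from collections import defaultdict


def last_name(collector):
    """Naive last-name guess (assumption): final whitespace-separated token, lowercased."""
    parts = collector.strip().split()
    return parts[-1].lower() if parts else ""


def cluster_duplicates(occurrences):
    """Group occurrence IDs that share collection date, collector number, and last name."""
    index = defaultdict(list)  # plays the role of the temporary index
    for occ in occurrences:
        key = (
            occ["eventDate"].strip(),
            occ["recordNumber"].strip(),
            last_name(occ["recordedBy"]),
        )
        # Records missing any of the three values cannot be matched, so skip them.
        if all(key):
            index[key].append(occ["occid"])
    # Only keys shared by two or more occurrences form a duplicate cluster.
    return [ids for ids in index.values() if len(ids) > 1]


# Hypothetical sample records, for illustration only.
occurrences = [
    {"occid": 1, "eventDate": "2019-06-14", "recordNumber": "1234", "recordedBy": "A. B. Smith"},
    {"occid": 2, "eventDate": "2019-06-14", "recordNumber": "1234", "recordedBy": "Alice Smith"},
    {"occid": 3, "eventDate": "2019-06-14", "recordNumber": "1235", "recordedBy": "Alice Smith"},
    {"occid": 4, "eventDate": "", "recordNumber": "1234", "recordedBy": "Alice Smith"},
]
clusters = cluster_duplicates(occurrences)
print(clusters)  # [[1, 2]]
```

In this sketch, records 1 and 2 are clustered because all three values agree, record 3 is left out because its collector number differs, and record 4 is skipped because it has no collection date.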

To view or link duplicates, navigate to your Administration Control Panel (My Profile > Occurrence Management > name of collection) and click Duplicate Clustering.

  • To view existing duplicates, click Specimen duplicate clusters.
  • To view duplicates with taxonomic identifications that do not match, click Specimen duplicate clusters with conflicted identifications. An example output of this tool is shown in the screenshot below.
  • To batch-link duplicates, click Batch link specimen duplicates. This automatically runs the batch-linking script, which creates the duplicate clusters.
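As a rough illustration of what a cluster with conflicted identifications means, the Python sketch below flags clusters whose members carry more than one distinct scientific name. It is an assumption-laden example rather than the tool's actual logic: the sciname key is a hypothetical field name, names are compared as plain case-insensitive strings (synonyms are not resolved), and records with no identification are ignored.

```python
def conflicted_clusters(clusters):
    """Return the clusters whose members have more than one distinct scientific name."""
    conflicted = []
    for cluster in clusters:
        # Collect the distinct, non-empty names in this cluster (case-insensitive).
        names = {
            rec["sciname"].strip().lower()
            for rec in cluster
            if rec["sciname"].strip()
        }
        if len(names) > 1:
            conflicted.append(cluster)
    return conflicted


# Hypothetical clusters, for illustration only.
example_clusters = [
    [{"occid": 1, "sciname": "Quercus alba"}, {"occid": 2, "sciname": "Quercus alba"}],
    [{"occid": 5, "sciname": "Acer rubrum"}, {"occid": 6, "sciname": "Acer saccharinum"}],
    [{"occid": 8, "sciname": "Pinus strobus"}, {"occid": 9, "sciname": ""}],
]
flagged = conflicted_clusters(example_clusters)
print([[rec["occid"] for rec in cluster] for cluster in flagged])  # [[5, 6]]
```

Only the second cluster is flagged here, since its two records are identified as different species.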

When viewing clustered duplicates, you can open the record for any occurrence by clicking its catalog number, which is shown in blue.

Screenshot: Example Duplicate Conflicts

Cite this page:

Katie Pearson. Duplicate Clustering. In: Symbiota Support Hub (2021). Symbiota Documentation. https://biokic.github.io/symbiota-docs/coll_manager/dup/. Created on 13 Dec 2021.