The search box in the search panel supports simple and complex search criteria using the Lucene query format. Basic and advanced example searches are outlined below. Search returns all datasets that are internal (i.e. visible to all) together with the searching user’s own private datasets. For more information about dataset-level visibility, see the ‘Dataset Visibility’ article.
Indexing
When you press Enter, your search query is executed against the data held in the search index, and the results are returned via the UI or API. Currently, the following dataset information is indexed:
- Dataset catalogue information compliant with the Data Catalog Vocabulary (DCAT) standard.
- Dataset dictionary information including table names, field names, field labels, field types and field descriptions.
- Any controlled vocabularies per dictionary.
When datasets are created, updated or deleted, the search index is automatically updated to reflect these changes.
Basic Search Examples
These searches are typical of those that you might enter into any search engine, e.g. Google.
Simple
Searching for datasets that contain the specified term(s). Example:
age
Exact Phrase
Searching for datasets that contain the exact phrase. Example:
"Alzheimer's Dementia"
OR
Searching for datasets that contain one or more of the specified terms. Examples:
age sex
age | sex
AND
Searching for datasets that contain all specified terms. Example:
age +sex
NOT
Searching for datasets that contain one term but not others. Example:
sex -age
Wildcard
Searching for datasets that contain terms starting with the specified criteria. Example:
alzheim*
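One way to picture the basic operators above is as set operations over the datasets that match each term. The Python sketch below is a conceptual model only: the index and dataset IDs are made up, it is not how the search service is implemented, and it ignores result ranking and exact-phrase matching.

```python
# Toy inverted index: each term maps to the set of dataset IDs containing it.
# The dataset IDs and terms are invented purely for illustration.
index = {
    "age": {1, 2, 3},
    "sex": {2, 3, 4},
    "alzheimer": {3},
    "alzheimers": {5},
}


def matching(term):
    """Return the IDs of datasets containing the exact term."""
    return index.get(term, set())


def matching_prefix(prefix):
    """Return the IDs of datasets containing any term that starts with prefix."""
    hits = set()
    for term, ids in index.items():
        if term.startswith(prefix):
            hits |= ids
    return hits


either = matching("age") | matching("sex")    # age | sex  (OR: union)
both = matching("age") & matching("sex")      # age +sex   (AND: intersection)
only_sex = matching("sex") - matching("age")  # sex -age   (NOT: difference)
prefixed = matching_prefix("alzheim")         # alzheim*   (wildcard: prefix match)

print(sorted(either), sorted(both), sorted(only_sex), sorted(prefixed))
# prints: [1, 2, 3, 4] [2, 3] [4] [3, 5]
```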
Advanced Search Examples
These searches use the Lucene query format to express more complex queries and narrow down the results.
Fuzzy Matching
Searching for datasets that contain or resemble the specified terms. Example:
age~
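Fuzzy matching in Lucene is based on edit distance: the number of single-character changes needed to turn one term into another. The Python sketch below computes the plain Levenshtein distance to show which terms count as "close" to age. It is an illustration only. The maximum distance the search service allows is not stated in this article, so the threshold of 1 used here is an assumption, and Lucene's own implementation additionally counts a swap of two adjacent characters as a single edit, which this simpler version does not.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character inserts, deletes or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


MAX_EDITS = 1  # assumed threshold, for illustration only
candidates = ["age", "aged", "ages", "cage", "agent", "sex"]
close = [term for term in candidates if edit_distance("age", term) <= MAX_EDITS]
print(close)
# prints: ['age', 'aged', 'ages', 'cage']
```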
Fielded Search
Searching for datasets that contain the term in a specific section of the dataset metadata. Example:
catalogue__description:age
Further information about fields to search can be found in the API documentation.
Mix and Match
Searching for datasets that contain multiple syntax operations. Example:
(age+(sex|group)) -"Alzheimer's Dementia"
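When a query like this is sent through the API rather than typed into the search box, characters such as +, | and " generally need to be percent-encoded so that they survive in a URL; an unencoded + in a query string is commonly decoded as a space. The Python sketch below shows one way to do that. The endpoint URL and the q parameter name are placeholders, not the real API; consult the API documentation for the actual values.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint and parameter name: assumptions for illustration only.
BASE_URL = "https://example.org/api/search"
QUERY_PARAM = "q"


def build_search_url(query):
    """Return a URL with the search query safely percent-encoded."""
    return BASE_URL + "?" + urlencode({QUERY_PARAM: query})


query = '(age+(sex|group)) -"Alzheimer\'s Dementia"'
url = build_search_url(query)
print(url)

# Decoding the URL gives back the original query unchanged.
decoded = parse_qs(urlsplit(url).query)[QUERY_PARAM][0]
print(decoded == query)
```

Building the URL with urlencode, rather than by string concatenation, keeps operators such as + from being silently turned into spaces on the receiving side.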