Method

To retrieve vectors from the index based on specific criteria, use the query method, which accepts the following parameters:

  • vector: The reference vector for similarity comparison.
  • include_metadata: A boolean flag indicating whether to include metadata in the query results.
  • include_vectors: A boolean flag indicating whether to include vectors in the query results.
  • include_data: A boolean flag indicating whether to include data in the query results.
  • top_k: The number of top matching vectors to retrieve.
  • filter: A metadata filter expression that narrows the query results to vectors whose metadata matches it.
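
As a minimal sketch of how a filter string like the one used in the example on this page could be assembled from a dictionary: `build_filter` is a hypothetical helper, not part of `upstash_vector`, and it only covers equality conditions on strings and numbers joined with `and`. Rejecting values that contain a single quote is an assumption made here to avoid guessing at escaping rules.

```python
from numbers import Real


def build_filter(conditions: dict) -> str:
    """Join equality conditions into one filter string (hypothetical helper)."""
    parts = []
    for field, value in conditions.items():
        # Only strings and plain numbers are handled in this sketch.
        if isinstance(value, bool) or not isinstance(value, (str, Real)):
            raise TypeError(f"unsupported value for {field!r}: {value!r}")
        if isinstance(value, str):
            # Assumption: reject embedded quotes instead of escaping them.
            if "'" in value:
                raise ValueError(f"single quote in value for {field!r}")
            parts.append(f"{field} = '{value}'")
        else:
            parts.append(f"{field} = {value}")
    return " and ".join(parts)


metadata_filter = build_filter({"genre": "fantasy", "title": "Lord of the Rings"})
print(metadata_filter)  # genre = 'fantasy' and title = 'Lord of the Rings'
```

The resulting string can be passed as the filter parameter of a query.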

The response is a list of results, each with the following fields:

  • id: The identifier associated with the matching vector.
  • metadata: Additional information or attributes linked to the matching vector.
  • score: A measure of similarity indicating how closely the vector matches the query vector. The score is normalized to the range [0, 1], where 1 indicates a perfect match.
  • vector: The vector itself (included only if include_vectors is set to True).
  • data: Additional unstructured information linked to the matching vector.

To learn more about filtering, see: Metadata Filtering
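
Because each result carries a score in the range [0, 1], results can be post-processed on the client, for example to drop weak matches. The sketch below assumes a hypothetical `ResultStub` class that only mirrors the field names listed above so the snippet runs without an index. The `keep_confident` helper and the 0.8 threshold are likewise illustrative choices.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ResultStub:
    """Hypothetical stand-in with the same field names as a query result."""
    id: str
    score: float
    metadata: Optional[dict] = None
    vector: Optional[List[float]] = None
    data: Optional[str] = None


def keep_confident(results, min_score: float = 0.8):
    """Return results whose score is at least min_score, best first."""
    kept = [r for r in results if r.score >= min_score]
    return sorted(kept, key=lambda r: r.score, reverse=True)


sample = [
    ResultStub("a", 0.95, {"genre": "fantasy"}),
    ResultStub("b", 0.42),
    ResultStub("c", 0.88),
]
for r in keep_confident(sample):
    print(r.id, r.score)
```

The same function should work on the objects returned by a query, since it only reads the score field.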

Query Example

import random

from upstash_vector import Index

index = Index.from_env()

# Generate a random vector for similarity comparison
dimension = 128  # Adjust based on your index's dimension
query_vector = [random.random() for _ in range(dimension)]

# Execute the query
query_result = index.query(
    vector=query_vector,
    include_metadata=True,
    include_data=True,
    include_vectors=False,
    top_k=5,
    filter="genre = 'fantasy' and title = 'Lord of the Rings'",
)

# Print the query result
for result in query_result:
    print("Score:", result.score)
    print("ID:", result.id)
    print("Vector:", result.vector)
    print("Metadata:", result.metadata)
    print("Data:", result.data)

Batch Query Method

You can also run several queries in a single call, which reduces round trips to the server.

Batch Query Example

import random

from upstash_vector import Index

index = Index.from_env()

# Generate random vectors for similarity comparison
dimension = 128  # Adjust based on your index's dimension
query_vectors = [[random.random() for _ in range(dimension)] for _ in range(2)]

# Execute the queries
query_results = index.query_many(
    queries=[
        {
            "vector": query_vectors[0],
            "include_metadata": True,
            "include_data": True,
            "include_vectors": False,
            "top_k": 5,
            "filter": "genre = 'fantasy' and title = 'Lord of the Rings'",
        },
        {
            "vector": query_vectors[1],
            "include_metadata": False,
            "include_data": False,
            "include_vectors": True,
            "top_k": 3,
            "filter": "genre = 'drama'",
        },
    ]
)

for i, query_result in enumerate(query_results):
    print(f"Query-{i} result:")

    # Print the query result
    for result in query_result:
        print("Score:", result.score)
        print("ID:", result.id)
        print("Vector:", result.vector)
        print("Metadata:", result.metadata)
        print("Data:", result.data)
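
When many queries share the same options, the `queries` list can be assembled programmatically instead of written out by hand. This is a minimal sketch: `build_queries` is a hypothetical helper, and it only produces dictionaries with the keys used in the batch example above.

```python
def build_queries(vectors, top_k=5, metadata_filter="", **options):
    """Build one query dict per vector, all sharing the same options."""
    queries = []
    for vector in vectors:
        query = {"vector": vector, "top_k": top_k, **options}
        # Add a filter only when one was given.
        if metadata_filter:
            query["filter"] = metadata_filter
        queries.append(query)
    return queries


queries = build_queries(
    [[0.1, 0.2], [0.3, 0.4]],
    top_k=3,
    metadata_filter="genre = 'drama'",
    include_metadata=True,
)
# The list can then be passed as index.query_many(queries=queries).
```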

You can also specify a namespace to operate on. When no namespace is provided, the default namespace is used.

index.query(..., namespace="ns")
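
If the same vector needs to be searched in several namespaces, one option is to query each namespace in turn and merge the results by score. The sketch below assumes a hypothetical helper, `query_namespaces`, that uses only the vector, top_k, and namespace parameters shown on this page. It issues one request per namespace, so it does not save round trips.

```python
def query_namespaces(index, vector, namespaces, top_k=5):
    """Query each namespace and return (namespace, result) pairs, best score first."""
    merged = []
    for ns in namespaces:
        for result in index.query(vector=vector, top_k=top_k, namespace=ns):
            merged.append((ns, result))
    # Keep only the overall top_k matches across all namespaces.
    merged.sort(key=lambda pair: pair[1].score, reverse=True)
    return merged[:top_k]
```

With an index created by `Index.from_env()`, a call might look like `query_namespaces(index, query_vector, ["ns", "other"])`, where `"other"` is a placeholder namespace name.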