Weaviate is a free, open-source vector database with GraphQL and RESTful APIs that allows you to leverage data linkage and automatic classification. It combines machine learning models with vector search to make information from different sources easily accessible and to connect the dots between your data objects.
In Weaviate, an object is the central data unit and represents an instance of a specific schema class. It has four main components:
- ID – A unique identifier for the data object, typically a UUID value.
- Class – The schema class to which the data object belongs. The class determines the properties of the object.
- Properties – The actual object data. Properties correspond to those defined in the schema class and can be of various supported data types such as text, numeric, date, Boolean, etc.
- Meta – Metadata about the object, such as the creation time, last update time, vector representation, etc.
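To make the four components more concrete, the sketch below writes one object as a Python dictionary and groups its fields accordingly. The field names (id, class, properties, creationTimeUnix, lastUpdateTimeUnix, vector) are our understanding of how Weaviate returns objects over REST and should be checked against your version; every value, including the UUID and the vector, is invented for illustration, and split_components is a hypothetical helper.

```python
# Illustrative only: a hand-written example of a single Weaviate object.
# All values (UUID, timestamps, vector) are invented for this sketch.
example_object = {
    "id": "36ddd591-2dee-4e7e-a3cc-eb86d30a4303",
    "class": "DatabaseServer",
    "properties": {"name": "Primary Database", "port": 3306},
    "creationTimeUnix": 1700000000000,
    "lastUpdateTimeUnix": 1700000000000,
    "vector": [0.12, -0.03, 0.88],
}


def split_components(obj):
    """Group an object's fields into the four components described above."""
    meta_keys = ("creationTimeUnix", "lastUpdateTimeUnix", "vector")
    return {
        "id": obj["id"],
        "class": obj["class"],
        "properties": obj["properties"],
        "meta": {key: obj[key] for key in meta_keys if key in obj},
    }


components = split_components(example_object)
print(sorted(components))  # ['class', 'id', 'meta', 'properties']
```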
In this tutorial, we will learn how to build a basic data object in Weaviate and then look at how to list the data objects in a Weaviate instance.
Weaviate Create Data Object
Before we can list anything, we need a basic data object for demonstration.
The first step is configuring the schema that represents the data structure we wish to store. The schema is composed of classes that represent the types of objects we wish to create.
Suppose we wish to create a schema that stores database information. We can start by creating a class called DatabaseServer with the properties as shown:
- Name – The name of the database server.
- Type – The type of database, e.g. MySQL, PostgreSQL, MongoDB, etc.
- Version – The version of the database server that is running.
- Host – The host address of the database server.
- Port – The port on which the server is running.
Once we have defined the schema class, we can create data objects representing instances of DatabaseServer.
We can use the Weaviate SDK for the Python programming language to create such a class and a data object, as shown in the example code below:
import weaviate

# Connect to a local Weaviate instance (no trailing space in the URL).
client = weaviate.Client("http://localhost:8080")

# Schema class describing a database server.
class_obj = {
    "class": "DatabaseServer",
    "description": "Information about a database server",
    "properties": [
        {
            "dataType": ["string"],
            "description": "Name of the database server",
            "name": "name",
        },
        {
            "dataType": ["string"],
            "description": "Type of the database server (MySQL, PostgreSQL, MongoDB, etc.)",
            "name": "type",
        },
        {
            "dataType": ["string"],
            "description": "Version of the database server",
            "name": "version",
        },
        {
            "dataType": ["string"],
            "description": "Host of the database server",
            "name": "host",
        },
        {
            "dataType": ["int"],
            "description": "Port of the database server",
            "name": "port",
        },
    ],
}

# Register the class in the Weaviate schema.
client.schema.create_class(class_obj)

# One instance of DatabaseServer.
data_obj = {
    "name": "Primary Database",
    "type": "MySQL",
    "version": "8.0.23",
    "host": "192.168.1.100",
    "port": 3306,
}

# Store the object; the call returns the UUID of the new object.
data_uuid = client.data_object.create(
    data_obj,
    "DatabaseServer",
    consistency_level=weaviate.data.replication.ConsistencyLevel.ALL,
)
This should create the DatabaseServer schema class and store one data object in it.
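Since the properties of a data object are expected to match those declared in its class, it can help to compare the two before sending anything to Weaviate. The following is a minimal sketch in plain Python, not part of the Weaviate client: undeclared_properties is a hypothetical helper, and the class definition is a trimmed copy of the one above with the descriptions left out.

```python
def undeclared_properties(class_obj, data_obj):
    """Return the keys of data_obj that the schema class does not declare."""
    declared = {prop["name"] for prop in class_obj["properties"]}
    return sorted(key for key in data_obj if key not in declared)


# Trimmed copy of the DatabaseServer class (descriptions omitted).
class_obj = {
    "class": "DatabaseServer",
    "properties": [
        {"name": "name", "dataType": ["string"]},
        {"name": "type", "dataType": ["string"]},
        {"name": "version", "dataType": ["string"]},
        {"name": "host", "dataType": ["string"]},
        {"name": "port", "dataType": ["int"]},
    ],
}

# "prot" is a deliberate typo for "port".
candidate = {"name": "Replica Database", "type": "MySQL", "prot": 3306}
print(undeclared_properties(class_obj, candidate))  # ['prot']
```

Depending on how the instance is configured, an undeclared property may be rejected or added to the schema automatically, so a quick check like this can catch typos early.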
Weaviate List Data Objects
There are various methods of fetching the data objects. The first and most common method is the REST API endpoint.
Using the API Endpoint
We can simply send a GET request to the /v1/objects endpoint to retrieve the data objects in the Weaviate instance.
NOTE: A request to the endpoint above without any parameters applies no class filter. Hence, it returns data objects across all the classes. However, it has a default limit of 25 objects.
You can also filter more granularly, as shown in the example syntax below:
GET /v1/objects?class={ClassName}&limit={limit}&include={include}
This allows you to specify which class you wish to target and the maximum number of data objects you wish the request to return. The include parameter asks for additional information on each object, such as its vector.
You can also use the offset parameter to perform paging. The offset parameter defines at which position you wish to start fetching the data objects.
For example, to fetch the first 10 data objects, you can run the request as:
GET /v1/objects?class=MyClass&limit=10
To fetch the next 10:
GET /v1/objects?class=MyClass&limit=10&offset=10
Similarly, to fetch the next 10 objects after that:
GET /v1/objects?class=MyClass&limit=10&offset=20
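The paging pattern above can be sketched in Python by deriving the offset from a page number and a page size. The helper names list_objects_url and page_urls are hypothetical, and the base URL assumes a local instance on port 8080.

```python
from urllib.parse import urlencode


def list_objects_url(base_url, class_name, limit, offset=0):
    """Build a /v1/objects URL for one page of results."""
    params = {"class": class_name, "limit": limit}
    if offset:
        # Only add offset when it is non-zero, as in the requests above.
        params["offset"] = offset
    return f"{base_url}/v1/objects?{urlencode(params)}"


def page_urls(base_url, class_name, page_size, pages):
    """Return the URLs for the first `pages` pages of `page_size` objects."""
    return [
        list_objects_url(base_url, class_name, page_size, page * page_size)
        for page in range(pages)
    ]


for url in page_urls("http://localhost:8080", "MyClass", 10, 3):
    print(url)
# http://localhost:8080/v1/objects?class=MyClass&limit=10
# http://localhost:8080/v1/objects?class=MyClass&limit=10&offset=10
# http://localhost:8080/v1/objects?class=MyClass&limit=10&offset=20
```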
To fetch the data objects from the DatabaseServer class we created in the previous example, we can run a query as shown:
curl -X GET "http://localhost:8080/v1/objects?class=DatabaseServer&limit=10" | jq
The above command sends a request to the /v1/objects endpoint to fetch the first 10 data objects from the DatabaseServer class. We also pipe the output to jq to format it more readably.
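If you prefer to stay in Python, roughly the same request can be made with the standard library alone. This is a sketch rather than the Weaviate client's own API: fetch_objects is a hypothetical helper, the base URL assumes a local instance, and the opener argument exists only so the function can be exercised without a running server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_objects(class_name, limit=10,
                  base_url="http://localhost:8080", opener=urlopen):
    """GET /v1/objects for one class and return the decoded JSON body."""
    query = urlencode({"class": class_name, "limit": limit})
    with opener(f"{base_url}/v1/objects?{query}") as response:
        return json.load(response)


# Example call (needs a running Weaviate instance):
# result = fetch_objects("DatabaseServer", limit=10)
# print(json.dumps(result, indent=2))
```

Here json.dumps with indent=2 plays the role that jq plays in the curl command.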
The query returns the data objects with detailed information such as the source class, creation time as a UNIX timestamp, last update time, and more, along with the total number of results.
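As a rough illustration of working with such a response, the sketch below pulls out the class, ID, and creation time of each object and converts the timestamp into a readable date. The sample response is hand-written, and the field names (objects, totalResults, creationTimeUnix) and the assumption that timestamps are in milliseconds should be verified against the output of your own instance; summarize is a hypothetical helper.

```python
from datetime import datetime, timezone


def summarize(response):
    """Return (class, id, creation time in ISO 8601) for each object."""
    rows = []
    for obj in response.get("objects", []):
        # Assumption: creationTimeUnix is in milliseconds since the epoch.
        created = datetime.fromtimestamp(
            obj["creationTimeUnix"] / 1000, tz=timezone.utc
        )
        rows.append((obj["class"], obj["id"], created.isoformat()))
    return rows


# Hand-written sample; the values are invented for illustration.
sample_response = {
    "objects": [
        {
            "class": "DatabaseServer",
            "id": "36ddd591-2dee-4e7e-a3cc-eb86d30a4303",
            "creationTimeUnix": 1700000000000,
            "lastUpdateTimeUnix": 1700000000000,
            "properties": {"name": "Primary Database", "port": 3306},
        }
    ],
    "totalResults": 1,
}

for row in summarize(sample_response):
    print(row)
```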
Conclusion
In this tutorial, we created a schema class and a data object, and explored the fundamentals of working with the /v1/objects API endpoint in Weaviate to list the data objects of a given class.