3 Best Practices For Using AI To Query Your Database

Discover three crucial best practices for using AI to query your database, including semantic search, reference guides, and schema injection. Learn how Locusive simplifies the implementation of these best practices, empowering your team to make data-driven decisions faster and more efficiently.
Shanif Dhanani
16.9 minutes

Introduction

AI-powered database querying is a powerful tool that can help you get answers from your data using natural language instead of complex SQL queries. This can be a huge time-saver, especially for non-technical folks who need access to data but don't have the skills to write SQL.

But just like with any tool, there are right ways and wrong ways to use AI for database querying. If you don't follow best practices, you might end up with inaccurate results, slow performance, or even security issues.

In this post, we'll share three best practices that we've found to be essential for getting the most out of AI-powered database querying:

  1. Provide relevant example queries and use semantic search to find the most relevant ones
  2. Create a reference guide that explains the structure and nuances of your database
  3. Always inject your database schema and its description into the prompt

By following these best practices, you can make your AI-powered database querying more accurate and efficient. These three practices are also covered in our larger guide to querying your database with AI. For now, let's dive in and take a closer look at each of them.

Best Practice 1: Provide Relevant Example Queries

A screenshot of how Locusive lets you input example queries

One of the most effective ways to help AI understand how to query your database is by providing example queries. Example queries serve as a guide for the AI, showing it how to structure queries and retrieve data based on different user intents and use cases.

When you provide a diverse range of example queries, you're essentially teaching the AI how to handle various scenarios. For instance, you might include examples of how to:

  • Filter data based on specific conditions
  • Aggregate data to calculate totals, averages, or counts
  • Join multiple tables to retrieve related data
  • Sort and limit results based on certain criteria

However, just providing a bunch of example queries isn't enough. You also need to make sure that the AI can find the most relevant examples based on the user's input. That's where semantic search comes in.

Semantic search is a technique that matches text by meaning rather than by exact keywords. In practice, this usually means converting the user's query and each example description into embedding vectors and ranking the examples by vector similarity, so a question that is worded differently from an example can still find it.

By using semantic search to match user queries with your example queries, you can ensure that the most relevant examples are injected into the prompt. This helps the AI generate more accurate and targeted queries based on the user's specific needs.

Here are a few tips for creating effective example queries and descriptions:

  • Cover a wide range of common use cases and query patterns
  • Provide clear descriptions of what each example query does and what results it returns
  • Use simple, clear language that's easy for both the AI and the user to understand
  • Include comments in your example queries to explain any complex logic or calculations
  • Regularly review and update your example queries based on user feedback and new use cases

To illustrate how semantic search can be used to inject relevant example queries, let's look at a simple code snippet:

# Example queries and their descriptions
example_queries = [
    {
        "query": "SELECT * FROM customers WHERE country = 'USA'",
        "description": "Retrieve all customers from the USA",
    },
    {
        "query": "SELECT product_name, SUM(quantity) AS total_sold FROM orders GROUP BY product_name ORDER BY total_sold DESC",
        "description": "Get the total quantity sold for each product, sorted by most sold",
    },
    # ... more example queries ...
]

# User's natural language query
user_query = "Show me the top 5 products by revenue in the last 30 days"

# Perform semantic search to find the most relevant example queries.
# semantic_search is a placeholder for your own retrieval function; it is
# assumed to return a list of dicts shaped like the entries above.
relevant_examples = semantic_search(user_query, example_queries)

# Format each example as readable text instead of a raw Python list
examples_text = "\n\n".join(
    f"-- {example['description']}\n{example['query']}"
    for example in relevant_examples
)

# Inject the relevant examples into the prompt
prompt = f"""
User query: {user_query}

Relevant example queries:
{examples_text}

Please generate an SQL query to answer the user's question based on the provided examples.
"""

In this example, we have a list of example queries and their descriptions. When the user enters a natural language query, we use semantic search to find the most relevant example queries based on the user's intent. Those relevant examples are then injected into the prompt, providing the AI with the necessary context to generate an accurate and targeted SQL query.
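The snippet above leaves `semantic_search` undefined. Here is a minimal sketch of what such a function might look like. A real system would most likely call an embedding model; to keep this example runnable with no external dependencies, a toy `embed` function based on word counts stands in for one. The names `embed`, `cosine_similarity`, and `top_k` are assumptions made for illustration, not part of any particular library.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_search(user_query, examples, top_k=2):
    """Return the top_k examples whose descriptions best match the query."""
    query_vector = embed(user_query)
    ranked = sorted(
        examples,
        key=lambda ex: cosine_similarity(query_vector, embed(ex["description"])),
        reverse=True,
    )
    return ranked[:top_k]


examples = [
    {"query": "SELECT * FROM customers WHERE country = 'USA'",
     "description": "Retrieve all customers from the USA"},
    {"query": "SELECT product_name, SUM(quantity) AS total_sold FROM orders "
              "GROUP BY product_name ORDER BY total_sold DESC",
     "description": "Get the total quantity sold for each product, sorted by most sold"},
]

best = semantic_search("Which product sold the most?", examples, top_k=1)
print(best[0]["description"])
```

Swapping `embed` for a call to a real embedding model would keep the rest of the ranking logic unchanged, which is the main reason for separating the two functions.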

By providing relevant example queries and using semantic search to find the most appropriate ones for each user input, you can significantly improve the accuracy and efficiency of your AI-powered database querying. In the next section, we'll discuss the importance of creating a reference guide for your database structure and nuances.

Best Practice 2: Provide a Reference Guide for Database Structure and Nuances

An example of creating a reference guide with Locusive

Another crucial aspect of using AI for database querying is providing the AI with a clear understanding of your database structure and its nuances. Even with relevant example queries, the AI may struggle to generate accurate queries if it doesn't have a solid grasp of how your tables and columns are organized and related to each other.

This is where a reference guide comes in. A reference guide is essentially a manual that describes your database schema, including tables, columns, relationships, and any specific rules or patterns that apply to your data.

By creating a comprehensive reference guide, you're giving the AI a roadmap to navigate your database effectively. This helps the AI make more informed decisions when generating queries, as it can take into account the specific structure and constraints of your data.

Here are some key elements to include in your reference guide:

  • A high-level overview of your database: Provide an overview of how your database is structured
  • How to join tables: Provide the system with information on how to join tables when certain queries come in
  • Data definitions and jargon: Tell the system about the business-specific details that are in your database
  • Common query patterns: Give examples of frequently used query patterns specific to your database, such as joining certain tables or filtering on specific conditions

A well-crafted reference guide can significantly improve the accuracy and context understanding of your AI-powered database querying. By providing the AI with a clear and comprehensive understanding of your database structure, you're reducing the chances of generating incorrect or inefficient queries.

Here are some tips for creating an effective reference guide:

  • Use clear, concise language that's easy for both the AI and users to understand
  • Organize the guide in a logical and easily navigable manner, such as by grouping related tables or concepts together
  • Include visual aids like diagrams and examples to help illustrate complex relationships or query patterns
  • Keep the guide up to date as your database schema evolves over time
  • Make the guide easily accessible to users and the AI, either by integrating it directly into your querying tool or storing it in a centralized location

To illustrate the impact of using a reference guide, let's consider a hypothetical example. Imagine a company that uses AI to help its marketing team query a database of customer information and sales data. Without a reference guide, the AI might struggle to understand how the various tables and columns relate to each other, leading to incorrect or incomplete query results.

By creating a comprehensive reference guide that describes the structure and relationships of the marketing database, the company can ensure that the AI has the necessary context to generate accurate and relevant queries. For instance, the guide might explain how the "customers" and "sales" tables are related through a foreign key, or how certain columns like "date" or "product_category" can be used to filter and aggregate data in meaningful ways.
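To make this concrete, here is a minimal sketch of what such a guide could look like when stored as plain text and placed in front of the user's question. The table names come from the example above; the guide's wording, the `customer_id` join column, the `sales.amount` column, and the `build_prompt` helper are all illustrative assumptions rather than a prescribed format.

```python
REFERENCE_GUIDE = """\
Overview:
- customers: one row per customer.
- sales: one row per sale; each sale belongs to one customer.

Joins:
- sales.customer_id references customers.customer_id.

Definitions:
- "Revenue" means the sum of sales.amount.

Common patterns:
- Filter by time period with sales.date.
- Group by sales.product_category to compare categories.
"""


def build_prompt(user_question, reference_guide=REFERENCE_GUIDE):
    """Combine the reference guide and the user's question into one prompt."""
    return (
        "You write SQL for the database described below.\n\n"
        f"Reference guide:\n{reference_guide}\n"
        f"User question: {user_question}\n\n"
        "Respond with a single SQL query."
    )


prompt = build_prompt("What was revenue by product category last month?")
print(prompt)
```

Keeping the guide as plain text in one place makes it easy to update as the schema changes, and the same text can be shown to users as documentation.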

With this reference guide in hand, the marketing team can use natural language queries to quickly and easily retrieve the data they need, without worrying about the complexities of the underlying database structure. This can lead to significant time savings and more data-driven decision making across the organization.

In the next section, we'll discuss the third best practice: injecting your database schema and its description into the prompt for even more accurate and context-aware AI-powered querying.

Best Practice 3: Inject Schema and Its Description into the Prompt

An example of how Locusive lets you include or exclude tables from your schema

While providing example queries and a reference guide can greatly improve the accuracy and context awareness of AI-powered database querying, there's one more best practice that can take your queries to the next level: injecting your database schema and its description directly into the prompt.

Injecting schema information into the prompt means providing the AI with a detailed description of your database structure every time it generates a query. This includes information about your tables, columns, data types, and relationships, all formatted in a way that's easy for the AI to parse and understand.

But why is this so important? By giving the AI direct access to your schema information in the prompt, you're essentially providing it with a cheat sheet to navigate your database. The AI can use this information to make more informed decisions about which tables and columns to use in the query, how to join them together, and what conditions to apply to filter and aggregate the data.

Here are some tips for effectively describing and formatting your schema information in the prompt:

  • Use a clear and consistent format, such as JSON or YAML, to describe your tables and columns
  • Include information about data types, primary and foreign keys, and any constraints or default values
  • Use descriptive names for your tables and columns that clearly convey their purpose and contents
  • Organize your schema description in a logical and easily navigable manner, such as by grouping related tables together
  • Provide examples of common query patterns or relationships that are specific to your database

Here's an example of how you might inject schema information into a prompt using JSON:

{
 "schema": [
   {
     "table": "customers",
     "columns": [
       {
         "name": "customer_id",
         "type": "integer",
         "primary_key": true
       },
       {
         "name": "first_name",
         "type": "varchar(50)"
       },
       {
         "name": "last_name",
         "type": "varchar(50)"
       },
       {
         "name": "email",
         "type": "varchar(100)",
         "unique": true
       }
     ]
   },
   {
     "table": "orders",
     "columns": [
       {
         "name": "order_id",
         "type": "integer",
         "primary_key": true
       },
       {
         "name": "customer_id",
         "type": "integer",
         "foreign_key": {
           "table": "customers",
           "column": "customer_id"
         }
       },
       {
         "name": "order_date",
         "type": "date"
       },
       {
         "name": "total_amount",
         "type": "decimal(10,2)"
       }
     ]
   }
 ]
}

In this example, we're providing the AI with a detailed description of two tables: "customers" and "orders". For each table, we're specifying the column names, data types, primary keys, foreign keys, and any unique constraints. This gives the AI a clear understanding of how the tables are structured and related to each other.

By injecting this schema information into the prompt, along with the user's natural language query and any relevant example queries, we're giving the AI all the tools it needs to generate an accurate and optimized SQL query.
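As a rough sketch of how these pieces might be assembled, the following function serializes an abbreviated version of the JSON schema above and combines it with example queries and the user's question. The `build_prompt` helper, its `include_tables` parameter (for leaving tables out of the prompt), and the prompt wording are assumptions for illustration.

```python
import json

# Abbreviated version of the schema shown above
schema = {
    "schema": [
        {
            "table": "customers",
            "columns": [
                {"name": "customer_id", "type": "integer", "primary_key": True},
                {"name": "email", "type": "varchar(100)", "unique": True},
            ],
        },
        {
            "table": "orders",
            "columns": [
                {"name": "order_id", "type": "integer", "primary_key": True},
                {"name": "customer_id", "type": "integer",
                 "foreign_key": {"table": "customers", "column": "customer_id"}},
                {"name": "total_amount", "type": "decimal(10,2)"},
            ],
        },
    ]
}


def build_prompt(user_query, schema, examples, include_tables=None):
    """Assemble schema, example queries, and the user's question into a prompt.

    include_tables optionally restricts the schema to the named tables.
    """
    tables = schema["schema"]
    if include_tables is not None:
        tables = [t for t in tables if t["table"] in include_tables]
    schema_text = json.dumps({"schema": tables}, indent=2)
    examples_text = "\n\n".join(
        f"-- {ex['description']}\n{ex['query']}" for ex in examples
    )
    return (
        f"Database schema (JSON):\n{schema_text}\n\n"
        f"Relevant example queries:\n{examples_text}\n\n"
        f"User query: {user_query}\n\n"
        "Generate a single SQL query that answers the user's question, "
        "using only the tables and columns in the schema."
    )


examples = [
    {"description": "Total order amount per customer",
     "query": "SELECT customer_id, SUM(total_amount) FROM orders GROUP BY customer_id"},
]
prompt = build_prompt("Who are our top 5 customers by spend?", schema, examples)
print(prompt)
```

Passing `include_tables={"customers"}` would drop the `orders` table from the prompt, which can help keep prompts short when a database has many tables that are irrelevant to the question.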

The benefits of injecting schema information into the prompt are significant:

  • Improved query accuracy: By providing the AI with a clear understanding of your database structure, you're reducing the chances of generating queries with syntax errors, incorrect column names, or invalid join conditions.
  • Reduced errors and edge cases: With the schema in front of it, the AI is less likely to make mistakes such as querying a non-existent table or column.
  • Faster query generation: By having all the necessary information readily available in the prompt, the AI can generate queries more quickly and efficiently, without needing to make additional API calls or database lookups.
  • More context-aware results: By understanding the relationships and constraints between your tables and columns, the AI can generate queries that return more contextually relevant and meaningful results.
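Even with the schema in the prompt, a generated query can still reference a table or column that doesn't exist, so it can be worth checking the output before running it. The sketch below is one possible approach and is not something the article itself prescribes: it recreates the two example tables in an empty in-memory SQLite database and asks SQLite to plan the query with `EXPLAIN`. The `check_query` helper and the DDL are assumptions for illustration, and SQLite's SQL dialect may differ from your production database, so treat this as a rough first-pass check only.

```python
import sqlite3

# DDL mirroring the two tables in the JSON schema above
SCHEMA_DDL = """
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(100) UNIQUE
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    order_date DATE,
    total_amount DECIMAL(10,2)
);
"""


def check_query(sql):
    """Return None if SQLite can plan the query, else the error message."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA_DDL)
        # EXPLAIN compiles the statement without needing any data in the tables
        conn.execute(f"EXPLAIN {sql}")
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()


good = check_query(
    "SELECT customer_id, SUM(total_amount) FROM orders GROUP BY customer_id"
)
bad = check_query("SELECT total FROM orders")  # 'total' is not a real column
print(good)
print(bad)
```

If `check_query` returns an error message, that message can be fed back to the AI along with the original prompt so it can attempt a corrected query.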

Injecting your database schema and its description into the prompt is a powerful best practice that can take your AI-powered database querying to new heights. Combined with relevant example queries and a comprehensive reference guide, this technique can help you generate highly accurate, efficient, and context-aware queries with minimal effort.

But implementing these best practices can be challenging, especially if you're building your own AI-powered querying system from scratch. That's where tools like Locusive come in, which we'll explore in the next section.

Implementing Best Practices with Locusive

Implementing the best practices we've discussed in this article can be a daunting task, especially if you're building an AI-powered database querying system from scratch. It requires significant time, resources, and expertise to develop and maintain a system that can effectively incorporate example queries, semantic search, reference guides, and schema injection. On top of that, there's a lot more than just these three items that you need to handle when building a system that can properly query your database.

That's where Locusive might help. Locusive is an off-the-shelf solution that simplifies AI-powered database querying by incorporating all of these best practices out of the box. With Locusive, you can quickly and easily enable your team to ask questions and get answers from your database using natural language, without needing to build and maintain your own complex system.

Here's how Locusive incorporates the three best practices we've discussed:

  1. Seamless integration of example queries and semantic search:
    • Locusive allows you to easily provide example queries and their descriptions, which are then used to train the AI to understand your specific use cases and data patterns.
    • The platform uses advanced semantic search techniques to match user queries with the most relevant example queries, ensuring that the AI always has the appropriate context to generate accurate results.
    • As you use Locusive over time, the system learns from your queries and continues to improve its semantic search capabilities, making it even more effective at finding relevant examples.
  2. Automated generation of reference guides:
    • Locusive automatically generates a comprehensive reference guide based on your database structure, including table and column descriptions, entity-relationship diagrams, and data dictionaries.
    • The reference guide is kept up to date in real-time as your database schema evolves, ensuring that the AI always has access to the latest information about your data.
    • Users can easily access the reference guide directly within the Locusive interface, allowing them to quickly understand the structure and relationships of your database without needing to refer to external documentation.
  3. Efficient injection of schema information into prompts:
    • Locusive automatically injects your database schema and its description into every prompt, providing the AI with a clear and comprehensive understanding of your data structure.
    • The platform uses a standardized format for describing schema information, ensuring that the AI can easily parse and understand the data types, relationships, and constraints of your tables and columns.
    • By having schema information readily available in the prompt, Locusive can generate queries more quickly and accurately, without needing to make additional API calls or database lookups.

In addition to these best practices, Locusive offers several other benefits that make it an attractive solution for organizations looking to implement AI-powered database querying:

  • User-friendly interface: Locusive provides an intuitive and easy-to-use interface that allows non-technical users to ask questions and get answers from your database using natural language, without needing to learn complex query languages or syntax.
  • Secure handling of credentials: The platform uses industry-standard encryption and security practices to protect your database credentials and ensure that only authorized users can access your data.
  • Scalability and performance: Locusive is designed to handle large databases and complex queries, with a scalable architecture that can grow with your needs over time.
  • Customization and integration: The platform can be easily customized to fit your specific use cases and data patterns, and can integrate with your existing tools and workflows to provide a seamless user experience.

If you're interested in trying Locusive with your own database, feel free to request a demo. We'll reach out to help you get onboarded, including creating your reference guides and example queries.

Conclusion

In this article, we've explored three best practices for using AI to query your database: providing relevant example queries and using semantic search, creating a reference guide for your database structure and nuances, and injecting schema information into the prompt. By implementing these best practices, you can enable your team to ask questions and get accurate, context-aware answers from your database using natural language, without needing to write complex SQL queries or rely on technical experts.

However, implementing these best practices from scratch can be challenging and time-consuming, requiring significant resources and expertise. That's where Locusive comes in – by providing an off-the-shelf solution that incorporates these best practices out of the box, Locusive simplifies the process of implementing AI-powered database querying and saves you time and resources.

If you're interested in exploring how Locusive can help your organization unlock the power of your data and enable your team to make better decisions faster, we encourage you to request a demo or free trial. Our team of experts will work with you to understand your specific needs and use cases, and show you how Locusive can be customized to fit your unique requirements.

As AI continues to revolutionize the way we interact with and analyze data, tools like Locusive will play an increasingly important role in driving adoption and success. By providing a user-friendly, secure, and scalable solution that incorporates best practices for AI-powered database querying, Locusive is helping organizations of all sizes and industries to harness the power of their data and make more informed decisions.

FAQ

  1. How does Locusive ensure the security and privacy of my database information?

    Locusive takes data security and privacy extremely seriously. The platform uses industry-standard encryption and security practices to protect your database credentials and ensure that only authorized users can access your data. All data is encrypted in transit and at rest, and your database credentials are stored separately from the encryption key that encrypts them.
  2. Can Locusive integrate with my existing database infrastructure?

    Yes, Locusive is designed to integrate seamlessly with your existing database infrastructure, regardless of the specific database management system or cloud platform you're using. The platform supports a wide range of popular databases, including PostgreSQL, MySQL, SQL Server, Oracle, and more, and can be easily configured to connect to your database using standard authentication methods.
  3. What kind of support and training does Locusive provide during onboarding?

    Locusive provides comprehensive support and training during the onboarding process to ensure that your team can get up and running quickly and effectively. This includes assistance with configuring your database connection, providing example queries and reference guides specific to your use case, and training your users on how to use the platform to ask questions and get answers from your database. Locusive also provides ongoing support and resources to help you get the most value from the platform over time.
  4. How does Locusive handle complex queries and large datasets?

    Locusive is designed to use a combination of optimized SQL querying and analysis to handle large datasets. Aggregation is done in your existing database, and Locusive's system can then automatically write code to analyze the results of queries in its own assistant's runtime process.
  5. Can I customize Locusive to fit my organization's specific needs and branding?

    Yes, we'll help you figure out the best way to approach branding and customization.