
PostgreSQL

Connect your workflows to PostgreSQL databases for powerful data storage and retrieval capabilities.

What is PostgreSQL?


The PostgreSQL node establishes a connection to a PostgreSQL database, allowing your workflow to store, retrieve, and manipulate data. As a Data Source node, it exposes that connection to other nodes that require database access. PostgreSQL is particularly valuable in AI workflows because it can be extended with pgvector to store vector embeddings for semantic search.

How to use it?

  1. Add credentials:

    • Click the PostgreSQL node and select credentials from the dropdown menu
    • You can use either standard username/password authentication or Supabase credentials
    • If you need to add new credentials, create them on your project's or organization's settings page
  2. Set up connection parameters:

    • Connect a Text node to the "Postgres Host" input anchor with your database hostname
    • Connect a Number node to the "Postgres Port" input anchor (the default PostgreSQL port is 5432)
    • Connect a Text node to the "Postgres Database" input anchor with your database name
  3. Connect to compatible nodes:

    • Use the "Postgres" output anchor to connect to nodes like Vector Store Writer or Vector Store Reader
    • The connection provides these nodes with access to your PostgreSQL database
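
You can sanity-check the same connection details from any SQL client before wiring them into the workflow. The queries below are a minimal sketch, assuming you connect with the same host, port, and database you plan to give the node:

-- Confirm which database and server you actually reached
-- (inet_server_addr() returns NULL over a Unix-socket connection)
SELECT current_database(), inet_server_addr(), inet_server_port();

-- Confirm the pgvector extension is installed (required for vector storage)
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';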

Example of usage

Building a Document Search System with PostgreSQL Vector Storage

This example demonstrates how to create a system that can store document embeddings in PostgreSQL and perform semantic searches to retrieve relevant content.

Step 1: Set up the database connection

First, we'll establish the connection to our PostgreSQL database:

  1. Add the PostgreSQL node to your workflow
  2. Add three input nodes to provide connection details:
    • A Text node with your database host (e.g., "localhost" or "my-db.example.com")
    • A Number node set to 5432 (standard PostgreSQL port)
    • A Text node with your database name (e.g., "vectordb")
  3. Connect these to the corresponding input anchors on the PostgreSQL node
  4. Set up your database credentials in the PostgreSQL node's settings panel

Step 2: Set up document processing

  1. Add a Text node containing the document content you want to process and make searchable
  2. Add a Document Splitter node:
    • Configure it to use "Recursive Character Splitter" as the Type
  3. Add a Text Embedder node:
    • Select an appropriate embedding model (e.g., "amazon.titan-embed-text-v1")
    • Configure the node with your AWS credentials and region

Step 3: Store document embeddings in PostgreSQL

  1. Add a Vector Store Writer node to your workflow:

    • Connect the PostgreSQL node's output to the "Vector Store" input anchor
    • Connect a Text node with "embeddings" to the "Table Name" input
    • Connect a Text node with a reference ID (e.g., "doc-001") to the "Reference" input
    • Connect your document Text node to the "Input Text" input
    • Connect your Text Embedder node to the "Text Embedder" input
    • Connect the Document Splitter node to the "Document Splitter" input
    • Optionally, connect a JSON Object node to the "Metadata" input to attach additional information
  2. Run this portion of the workflow to store your document embeddings in the PostgreSQL database
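
After running the write step, you can confirm the chunks landed in the database from any SQL client. This is an illustrative check, assuming the "embeddings" table created in the Additional Information section below:

-- Inspect the most recently written chunks
SELECT id, reference, left(content, 60) AS preview, created_at
FROM embeddings
ORDER BY created_at DESC
LIMIT 5;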

Step 4: Create the search functionality

  1. Add another Text node with a search query (e.g., "How do machine learning models work?")

  2. Add a Vector Store Reader node:

    • Connect the PostgreSQL node's output to the "Vector Store" input
    • Connect a Text node with "embeddings" to the "Table Name" input
    • Connect a Number node (e.g., value 5) to the "Number of Search Results" input
    • Connect your query Text node to the "Text" input
    • Connect your Text Embedder node to the "Text Embedder" input
    • Set the output format to "Text" or "JSON" based on your preference
  3. Add an Output node:

    • Connect the Vector Store Reader's output to display search results

This setup creates a complete workflow that:

  1. Processes a document and splits it into chunks
  2. Creates vector embeddings for those chunks
  3. Stores the chunks and their vector representations in PostgreSQL
  4. Allows you to search for relevant content using natural language queries
  5. Returns the most semantically similar sections of your document
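
Under the hood, the reader's lookup behaves much like a pgvector nearest-neighbour query. The SQL below is a conceptual sketch rather than the node's exact implementation; $1 stands for the query embedding produced by the Text Embedder:

-- Return the 5 chunks closest to the query embedding
-- (<=> is pgvector's cosine distance operator)
SELECT content, 1 - (embedding <=> $1) AS cosine_similarity
FROM embeddings
ORDER BY embedding <=> $1
LIMIT 5;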

Additional Information

PostgreSQL Database Setup

Before using the PostgreSQL node for vector operations, ensure your database is properly set up:

  1. Install PostgreSQL 12 or higher
  2. Install the pgvector extension from GitHub: https://github.com/pgvector/pgvector
  3. Create and configure your database:
-- Enable the vector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create a table to store embeddings
CREATE TABLE embeddings (
    id SERIAL PRIMARY KEY,
    content TEXT,                        -- Text content
    embedding vector(1536),              -- Vector dimension depends on your model
    metadata JSONB,                      -- Optional metadata
    reference TEXT,                      -- Document reference ID
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Create an index for faster similarity searches (choose one)
-- HNSW index (faster queries, slower build; requires pgvector 0.5.0+)
CREATE INDEX ON embeddings USING hnsw (embedding vector_cosine_ops);
-- OR IVFFlat index (faster build, better suited to smaller datasets)
-- CREATE INDEX ON embeddings USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
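
With the extension, table, and index in place, a quick smoke test confirms pgvector works end to end. This example uses a temporary table with toy 3-dimensional vectors, so it makes no assumptions about your data:

-- Throwaway smoke test; the temp table disappears when the session ends
CREATE TEMP TABLE pgvector_smoke (v vector(3));
INSERT INTO pgvector_smoke VALUES ('[1,0,0]'), ('[0.9,0.1,0]'), ('[0,1,0]');
SELECT v, v <=> '[1,0,0]' AS cosine_distance
FROM pgvector_smoke
ORDER BY v <=> '[1,0,0]'
LIMIT 2;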

Troubleshooting

Common issues and their solutions:

  1. Connection errors:

    • Verify your database host, port, and name are correct
    • Ensure your database user has sufficient privileges
    • Check if your database allows remote connections (pg_hba.conf)
    • Verify network connectivity and firewall rules
  2. Vector operations errors:

    • Confirm pgvector extension is installed and activated
    • Check that your vector dimensions match your embedding model's output (e.g., OpenAI's text-embedding-ada-002 produces 1536-dimensional vectors)
  3. Performance issues:

    • Ensure you have proper indexes created for vector similarity searches
    • Consider adjusting PostgreSQL configuration for better performance with vectors
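
For items 1 and 2 above, a few queries run as the workflow's database user can isolate the problem quickly. These assume the "embeddings" table from the setup section:

-- Can this user write to the table at all?
SELECT has_table_privilege(current_user, 'embeddings', 'INSERT');

-- Is pgvector installed, and at what version?
SELECT extversion FROM pg_extension WHERE extname = 'vector';

-- What dimension are the stored vectors? (must match your embedding model)
SELECT vector_dims(embedding) FROM embeddings LIMIT 1;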

Best Practices

  1. Security:

    • Use a dedicated database user with minimal required permissions
    • Store credentials securely and avoid hardcoding them
    • Enable SSL connections to protect data in transit
  2. Performance:

    • Use appropriate vector indexes based on your data size and query patterns
    • Consider connection pooling for production applications
    • Monitor query performance and adjust as necessary
  3. Data Management:

    • Implement a strategy for managing outdated or irrelevant embeddings
    • Consider partitioning large tables by reference ID or date
    • Regularly backup your vector database
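
As an illustration of the dedicated-user and cleanup recommendations above, the snippet below is a minimal sketch; the role name, password, and retention window are placeholders to adapt:

-- Dedicated role with only the privileges the workflow needs
CREATE ROLE workflow_user LOGIN PASSWORD 'change-me';
GRANT CONNECT ON DATABASE vectordb TO workflow_user;
GRANT USAGE ON SCHEMA public TO workflow_user;
GRANT SELECT, INSERT ON embeddings TO workflow_user;
GRANT USAGE ON SEQUENCE embeddings_id_seq TO workflow_user;  -- needed for SERIAL inserts

-- Example cleanup: drop embeddings older than 90 days
DELETE FROM embeddings WHERE created_at < NOW() - INTERVAL '90 days';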

By following this guide, you can use PostgreSQL as a powerful data store for your Nocodo AI workflows, particularly for advanced use cases involving vector embeddings and semantic search.