From a833300530d6c507a55b08e26c95220f30b590d0 Mon Sep 17 00:00:00 2001
From: Dierk
Date: Tue, 19 May 2026 12:02:54 +0200
Subject: [PATCH] Add setup instructions to README for all three backends

Covers PostgreSQL database creation and schema setup, Oracle vectors_user
setup, and Oracle in-database ONNX model loading. Also updates project
structure to include the new sql/ directories.

Co-Authored-By: Claude Sonnet 4.6
---
 README.md | 100 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 94 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index a44918c..10c389c 100644
--- a/README.md
+++ b/README.md
@@ -70,6 +70,8 @@ vector-search-demo/
 ├── stop.sh  # Stop all three backends
 ├── photos/  # 116 JPEG photos (gitignored)
 ├── pgvector-demo/
+│   ├── sql/
+│   │   └── setup.sql  # Create table and HNSW index
 │   ├── backend/
 │   │   ├── .env  # PostgreSQL credentials, photo path
 │   │   ├── db.py  # PostgreSQL connection factory
@@ -79,6 +81,9 @@ vector-search-demo/
 │   └── frontend/
 │       └── index.html  # Search UI (served at /ui/)
 └── oravector-demo/
+    ├── sql/
+    │   ├── setup_vectors_user.sql  # Create vectors_user, table and HNSW index
+    │   └── setup_vector_schema.sql  # Create VECTOR user, load ONNX models, FOTO_VEKTOR table
     ├── backend/
     │   ├── .env  # Oracle credentials, photo path
     │   ├── db_oracle.py  # Oracle connection factory
@@ -113,12 +118,8 @@ vector-search-demo/
 cd ~/docker/postgresql && docker compose up -d
 ```
 
-The `pgvector/pgvector:pg18` image includes pgvector pre-installed. The extension
-must be activated once per database:
-
-```bash
-docker exec postgresql-database-1 psql -U dl -d vectors_demo -c "CREATE EXTENSION vector;"
-```
+The `pgvector/pgvector:pg18` image includes pgvector pre-installed. See the
+[Setup from scratch](#setup-from-scratch) section for first-time database setup.
 
 ### Oracle 26ai (Podman container)
 
@@ -346,6 +347,93 @@ scores in percent.
 
 ---
 
+## Setup from scratch
+
+### 1. PostgreSQL
+
+**Start the container:**
+```bash
+cd ~/docker/postgresql && docker compose up -d
+```
+
+**Create the database:**
+```bash
+docker exec postgresql-database-1 psql -U dl -d pgdl -c "CREATE DATABASE vectors_demo;"
+```
+
+**Run the setup script** (creates the pgvector extension, `images` table, and HNSW index):
+```bash
+docker exec -i postgresql-database-1 psql -U dl -d vectors_demo -f - \
+  < pgvector-demo/sql/setup.sql
+```
+
+**Copy photos and index them:**
+```bash
+cd pgvector-demo/backend && python3 index_images.py
+```
+
+---
+
+### 2. Oracle 26ai — Python embedding backend
+
+**Configure vector memory** (once, requires Oracle restart):
+```bash
+podman exec oracle.free bash -c "sqlplus -s / as sysdba <<'EOF'
+ALTER SYSTEM SET vector_memory_size = 512M SCOPE=SPFILE;
+SHUTDOWN ABORT;
+STARTUP;
+EXIT;
+EOF"
+```
+
+**Run the setup script** (creates `vectors_user`, the `images` table, and HNSW index):
+
+Copy the script into the container and run it as SYSDBA:
+```bash
+podman cp oravector-demo/sql/setup_vectors_user.sql oracle.free:/tmp/
+podman exec oracle.free bash -c "sqlplus -s / as sysdba @/tmp/setup_vectors_user.sql"
+```
+
+**Index the photos:**
+```bash
+cd oravector-demo/backend && python3 index_images_oracle.py
+```
+
+---
+
+### 3. Oracle 26ai — in-database embedding backend
+
+This backend requires CLIP ONNX models loaded into the Oracle database. The setup
+is more involved and is intended to be done once by an administrator.
+
+**Prerequisites:**
+- CLIP ONNX model files (`clip_txt.onnx`, `clip_img.onnx`) present in the Oracle
+  VEC_DUMP directory inside the container (typically `/opt/oracle/dbs/vec_dump/`)
+- The `clip_txt.onnx` model must use **CLS-token pooling** (position 0), not the
+  standard EOS-token pooling — Oracle's ONNX validator rejects models that use
+  `ArgMax` on `input_ids`. See the [Oracle in-database embedding](#oracle-in-database-embedding)
+  section for details.
+
+**Run the setup script** (creates `VECTOR` user, loads ONNX models, creates `FOTO_VEKTOR` table):
+```bash
+podman cp oravector-demo/sql/setup_vector_schema.sql oracle.free:/tmp/
+podman exec oracle.free bash -c "sqlplus -s / as sysdba @/tmp/setup_vector_schema.sql"
+```
+
+**Populate `FOTO_VEKTOR`** with images and their vectors (run as VECTOR user in SQL):
+```sql
+-- Example: insert one photo with its CLIP_IMG embedding
+INSERT INTO vector.foto_vektor (filename, foto, foto_vek)
+VALUES (
+  'photo.jpg',
+  TO_BLOB(BFILENAME('VEC_DUMP', 'photo.jpg')),
+  VECTOR_EMBEDDING(CLIP_IMG USING TO_BLOB(BFILENAME('VEC_DUMP', 'photo.jpg')) AS data)
+);
+COMMIT;
+```
+
+---
+
 ## Running the applications
 
 ### Start all backends