Add setup instructions to README for all three backends

Covers PostgreSQL database creation and schema setup, Oracle vectors_user
setup, and Oracle in-database ONNX model loading. Also updates project
structure to include the new sql/ directories.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 12:02:54 +02:00
parent d360ff1a78
commit a833300530
+94 -6
@@ -70,6 +70,8 @@ vector-search-demo/
├── stop.sh # Stop all three backends
├── photos/ # 116 JPEG photos (gitignored)
├── pgvector-demo/
│ ├── sql/
│ │ └── setup.sql # Create table and HNSW index
│ ├── backend/
│ │ ├── .env # PostgreSQL credentials, photo path
│ │ ├── db.py # PostgreSQL connection factory
@@ -79,6 +81,9 @@ vector-search-demo/
│ └── frontend/
│ └── index.html # Search UI (served at /ui/)
└── oravector-demo/
├── sql/
│ ├── setup_vectors_user.sql # Create vectors_user, table and HNSW index
│ └── setup_vector_schema.sql # Create VECTOR user, load ONNX models, FOTO_VEKTOR table
├── backend/
│ ├── .env # Oracle credentials, photo path
│ ├── db_oracle.py # Oracle connection factory
@@ -113,12 +118,8 @@ vector-search-demo/
cd ~/docker/postgresql && docker compose up -d
```
The `pgvector/pgvector:pg18` image includes pgvector pre-installed. The extension
must be activated once per database:
```bash
docker exec postgresql-database-1 psql -U dl -d vectors_demo -c "CREATE EXTENSION vector;"
```
The `pgvector/pgvector:pg18` image includes pgvector pre-installed. See the
[Setup from scratch](#setup-from-scratch) section for first-time database setup.
### Oracle 26ai (Podman container)
@@ -346,6 +347,93 @@ scores in percent.
---
## Setup from scratch
### 1. PostgreSQL
**Start the container:**
```bash
cd ~/docker/postgresql && docker compose up -d
```
**Create the database:**
```bash
docker exec postgresql-database-1 psql -U dl -d pgdl -c "CREATE DATABASE vectors_demo;"
```
**Run the setup script** (creates the pgvector extension, `images` table, and HNSW index):
```bash
docker exec -i postgresql-database-1 psql -U dl -d vectors_demo -f - \
< pgvector-demo/sql/setup.sql
```
**Copy the photos to the path configured in `backend/.env`, then index them:**
```bash
cd pgvector-demo/backend && python3 index_images.py
```
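`index_images.py` is not shown in this change, so the following is only a minimal sketch of one step such an indexer typically needs: turning an embedding into pgvector's text form (`[v1,v2,...]`). The function name `to_pgvector_literal` is hypothetical and not taken from the project.

```python
import math


def to_pgvector_literal(embedding):
    """Format a sequence of numbers as pgvector text input, e.g. '[0.5,-1.0]'.

    Hypothetical helper for illustration; not part of the project code.
    """
    values = [float(x) for x in embedding]
    if not values:
        raise ValueError("embedding must not be empty")
    # pgvector rejects NaN and infinite components, so fail early here.
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding must contain only finite numbers")
    return "[" + ",".join(repr(v) for v in values) + "]"


if __name__ == "__main__":
    print(to_pgvector_literal([0.5, -1.0, 2.0]))
```

The resulting string can be passed as a query parameter and cast with `::vector` in the `INSERT` statement.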
---
### 2. Oracle 26ai — Python embedding backend
**Configure vector memory** (once, requires Oracle restart):
```bash
podman exec oracle.free bash -c "sqlplus -s / as sysdba <<'EOF'
ALTER SYSTEM SET vector_memory_size = 512M SCOPE=SPFILE;
SHUTDOWN ABORT;
STARTUP;
EXIT;
EOF"
```
**Run the setup script** (creates `vectors_user`, the `images` table, and the HNSW index).
Copy the script into the container and run it as SYSDBA:
```bash
podman cp oravector-demo/sql/setup_vectors_user.sql oracle.free:/tmp/
podman exec oracle.free bash -c "sqlplus -s / as sysdba @/tmp/setup_vectors_user.sql"
```
**Index the photos:**
```bash
cd oravector-demo/backend && python3 index_images_oracle.py
```
---
### 3. Oracle 26ai — in-database embedding backend
This backend requires CLIP ONNX models loaded into the Oracle database. The setup
is more involved and is intended to be done once by an administrator.
**Prerequisites:**
- CLIP ONNX model files (`clip_txt.onnx`, `clip_img.onnx`) present in the Oracle
VEC_DUMP directory inside the container (typically `/opt/oracle/dbs/vec_dump/`)
- The `clip_txt.onnx` model must use **CLS-token pooling** (position 0), not the
standard EOS-token pooling — Oracle's ONNX validator rejects models that use
`ArgMax` on `input_ids`. See the [Oracle in-database embedding](#oracle-in-database-embedding)
section for details.
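The difference between the two pooling strategies can be sketched in plain Python. This is an illustration only, not the export code used for `clip_txt.onnx`: the token ids and hidden states are made-up sample values, and the function names are hypothetical. It shows why EOS pooling needs an `ArgMax` over `input_ids` (CLIP's end-of-text token has the highest id in the vocabulary), while CLS pooling reads a fixed position.

```python
def eos_pool(hidden_states, input_ids):
    """EOS-token pooling: pick the hidden state at the argmax of input_ids."""
    # Equivalent to ArgMax over input_ids; returns the first maximum.
    eos_pos = max(range(len(input_ids)), key=input_ids.__getitem__)
    return hidden_states[eos_pos]


def cls_pool(hidden_states):
    """CLS-token pooling: always pick position 0, no ArgMax required."""
    return hidden_states[0]


if __name__ == "__main__":
    # Sample sequence: start token, two word tokens, end token, padding.
    input_ids = [49406, 320, 1125, 49407, 0, 0]
    hidden_states = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6],
                     [0.7, 0.8], [0.0, 0.0], [0.0, 0.0]]
    print(eos_pool(hidden_states, input_ids))  # state at position 3
    print(cls_pool(hidden_states))             # state at position 0
```

The sketch only shows which position each strategy reads; it says nothing about how the model itself has to be re-exported.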
**Run the setup script** (creates `VECTOR` user, loads ONNX models, creates `FOTO_VEKTOR` table):
```bash
podman cp oravector-demo/sql/setup_vector_schema.sql oracle.free:/tmp/
podman exec oracle.free bash -c "sqlplus -s / as sysdba @/tmp/setup_vector_schema.sql"
```
**Populate `FOTO_VEKTOR`** with images and their vectors (run as the `VECTOR` user in a SQL session; the image files must be readable through the `VEC_DUMP` directory):
```sql
-- Example: insert one photo with its CLIP_IMG embedding
INSERT INTO vector.foto_vektor (filename, foto, foto_vek)
VALUES (
'photo.jpg',
TO_BLOB(BFILENAME('VEC_DUMP', 'photo.jpg')),
VECTOR_EMBEDDING(CLIP_IMG USING TO_BLOB(BFILENAME('VEC_DUMP', 'photo.jpg')) AS data)
);
COMMIT;
```
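To load more than one photo, the statement above can be generated per file. The following is a minimal sketch, assuming the photos have already been copied into the `VEC_DUMP` directory; the helper names (`sql_quote`, `build_insert`, `build_script`) are hypothetical, and the table, columns and model name are taken from the example above.

```python
def sql_quote(text):
    """Return text as a SQL string literal, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def build_insert(filename, directory="VEC_DUMP"):
    """Build one INSERT for a photo located in the given Oracle directory."""
    name = sql_quote(filename)
    bfile = f"BFILENAME({sql_quote(directory)}, {name})"
    return (
        "INSERT INTO vector.foto_vektor (filename, foto, foto_vek)\n"
        f"VALUES ({name}, TO_BLOB({bfile}),\n"
        f"  VECTOR_EMBEDDING(CLIP_IMG USING TO_BLOB({bfile}) AS data));"
    )


def build_script(filenames):
    """Build a SQL script with one INSERT per file and a final COMMIT."""
    statements = [build_insert(name) for name in sorted(filenames)]
    return "\n".join(statements + ["COMMIT;"]) + "\n"


if __name__ == "__main__":
    print(build_script(["beach.jpg", "alps.jpg"]))
```

The printed script can be saved to a file and run in the same SQL session. When inserting from a database driver instead, bind variables are preferable to generated string literals.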
---
## Running the applications
### Start all backends