873 Commits

Author SHA1 Message Date
Hammad Bashir
c41bfa4ffc [RELEASE] 0.4.13 (#1180)
Release 0.4.13
2023-09-25 09:30:43 -07:00
Hammad Bashir
8a6ad07127 [CHORE] Add support for pydantic v2 (#1174)
## Description of changes
Closes #893 

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - Adds support for pydantic v2.0 by changing how the Collection model is initialized; this simple change fixes pydantic v2
	 - Fixes the cross version tests to handle pydantic specifically
	 - Conditionally imports pydantic-settings based on what is available. In v2, BaseSettings was moved to a new package.
 - New functionality
	 - N/A
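
The conditional import described above could be sketched roughly as follows. This is an illustration only, not the code from this PR: `import_first` is a hypothetical helper, and the candidate list assumes `BaseSettings` is exposed by `pydantic_settings` under pydantic v2 and by `pydantic` itself under v1.

```python
import importlib
from typing import Any, Iterable, Tuple


def import_first(candidates: Iterable[Tuple[str, str]]) -> Any:
    """Return the first importable attribute from (module, attribute) pairs."""
    candidates = list(candidates)
    for module_name, attr in candidates:
        try:
            return getattr(importlib.import_module(module_name), attr)
        except (ImportError, AttributeError):
            continue  # package not installed, or the name lives elsewhere
    raise ImportError(f"none of {candidates} could be imported")


# Assumed locations of BaseSettings: the separate pydantic-settings package
# for pydantic v2, and pydantic itself for v1.
SETTINGS_CANDIDATES = [
    ("pydantic_settings", "BaseSettings"),
    ("pydantic", "BaseSettings"),
]

# Usage (needs some version of pydantic installed):
# BaseSettings = import_first(SETTINGS_CANDIDATES)
```

Trying the v2 location first means a v2 install never touches the removed `pydantic.BaseSettings` name.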

## Test plan
Existing tests were run with the following configs
1. FastAPI < 0.100, Pydantic >= 2.0: Unsupported, as the fastapi
dependencies will not allow it. They likely should, as pydantic.v1
imports would support this, but this is a downstream issue.
2. FastAPI >= 0.100, Pydantic >= 2.0: Supported via normal imports
(Tested with fastapi==0.103.1, pydantic==2.3.0)
3. FastAPI < 0.100, Pydantic < 2.0: Supported via normal imports
(Tested with fastapi==0.95.2, pydantic==1.9.2)
4. FastAPI >= 0.100, Pydantic < 2.0: Supported via normal imports
(Tested with latest fastapi, pydantic==1.9.2)

- [x] Tests pass locally with `pytest` for python, `yarn test` for js

## Documentation Changes
None required.
2023-09-25 09:25:39 -07:00
Ben Eggers
c7a0414ea7 [ENH] Metric batching and more metrics (#1163)
## Description of changes
This PR accomplishes two things:
- Adds batching to metrics to decrease load to Posthog
- Adds more metric instrumentation

Each `TelemetryEvent` type now has a `batch_size` member defining how
many of that Event to include in a batch. `TelemetryEvent`s with
`batch_size > 1` must also define `can_batch()` and `batch()` methods to
do the actual batching -- our posthog client can't do this itself since
different `TelemetryEvent`s use different count fields. The Posthog
client combines events until they hit their `batch_size` then fires them
off as one event.

NB: this means we can drop up to `batch_size` events -- since we only
batch `add()` calls right now this seems fine, though we may want to
address it in the future.
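
A minimal sketch of the batching behaviour described above, under stated assumptions: `AddEvent`, `BatchingClient`, and `max_batch_size` are hypothetical names for illustration, and the real `TelemetryEvent` classes and Posthog client may be structured differently. Here `batch_size` counts how many events have been folded together, and `sent` stands in for the network call.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AddEvent:
    collection_uuid: str
    add_amount: int
    batch_size: int = 1       # events folded into this one so far
    max_batch_size: int = 20  # hypothetical name for the flush threshold

    def can_batch(self, other: "AddEvent") -> bool:
        # Only events for the same collection are combined.
        return self.collection_uuid == other.collection_uuid

    def batch(self, other: "AddEvent") -> "AddEvent":
        # Each event type sums its own count fields, which is why the
        # client cannot do this generically.
        return AddEvent(
            collection_uuid=self.collection_uuid,
            add_amount=self.add_amount + other.add_amount,
            batch_size=self.batch_size + other.batch_size,
            max_batch_size=self.max_batch_size,
        )


class BatchingClient:
    def __init__(self) -> None:
        self.pending: Dict[str, AddEvent] = {}
        self.sent: List[AddEvent] = []  # stands in for the real capture call

    def capture(self, event: AddEvent) -> None:
        key = event.collection_uuid
        current = self.pending.get(key)
        if current is not None and current.can_batch(event):
            event = current.batch(event)
        if event.batch_size >= event.max_batch_size:
            self.sent.append(event)      # fire the combined event
            self.pending.pop(key, None)
        else:
            self.pending[key] = event    # lost if the process exits now


client = BatchingClient()
for _ in range(50):
    client.capture(AddEvent("c1", add_amount=1))
for _ in range(50):
    client.capture(AddEvent("c1", add_amount=20))

amounts = [e.add_amount for e in client.sent]  # → [20, 20, 210, 400, 400]
```

The 210 arises because the ten single-document events left pending by the first loop are merged with the first ten 20-document events of the second loop.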

As for the additional telemetry, I pretty much copied Anton's draft
https://github.com/chroma-core/chroma/pull/859 with some minor changes.

Other considerations: Maybe we should implement `can_batch()` and
`batch()` on all events, even those which don't currently use them? I'd
prefer not to leave dead code hanging around but happy to go either way.

I created a ticket for the type ignores:
https://github.com/chroma-core/chroma/issues/1169

## Test plan
pytest passes modulo a couple unrelated failures

With `print(event.properties)` in posthog client's `_direct_capture()`:
```
>>> import chromadb
>>> client = chromadb.Client()
{'batch_size': 1}
>>> collection = client.create_collection("sample_collection")
{'batch_size': 1, 'collection_uuid': 'bb19d790-4ec7-436c-b781-46dab047625d', 'embedding_function': 'ONNXMiniLM_L6_V2'}
>>> collection.add(
...     documents=["This is document1", "This is document2"], # we embed for you, or bring your own
...     metadatas=[{"source": "notion"}, {"source": "google-docs"}], # filter on arbitrary metadata!
...     ids=["doc1", "doc2"], # must be unique for each doc 
... )
{'batch_size': 1, 'collection_uuid': 'bb19d790-4ec7-436c-b781-46dab047625d', 'add_amount': 2, 'with_documents': 2, 'with_metadata': 2}
>>> for i in range(50):
...   collection.add(documents=[str(i)], ids=[str(i)])
... 
{'batch_size': 20, 'collection_uuid': 'bb19d790-4ec7-436c-b781-46dab047625d', 'add_amount': 20, 'with_documents': 20, 'with_metadata': 0}
{'batch_size': 20, 'collection_uuid': 'bb19d790-4ec7-436c-b781-46dab047625d', 'add_amount': 20, 'with_documents': 20, 'with_metadata': 0}
>>> for i in range(50):
...   collection.add(documents=[str(i) + ' ' + str(n) for n in range(20)], ids=[str(i) + ' ' + str(n) for n in range(20)])
... 
{'batch_size': 20, 'collection_uuid': 'bb19d790-4ec7-436c-b781-46dab047625d', 'add_amount': 210, 'with_documents': 210, 'with_metadata': 0}
{'batch_size': 20, 'collection_uuid': 'bb19d790-4ec7-436c-b781-46dab047625d', 'add_amount': 400, 'with_documents': 400, 'with_metadata': 0}
{'batch_size': 20, 'collection_uuid': 'bb19d790-4ec7-436c-b781-46dab047625d', 'add_amount': 400, 'with_documents': 400, 'with_metadata': 0}
```

## Documentation Changes
https://github.com/chroma-core/docs/pull/139

a4fd57d4d2
2023-09-22 15:49:58 -07:00
Ben Eggers
317d547d7a Fix failing example terraform (#1175)
## Description of changes

`lifecycle` blocks don't allow variables. Right now our example
deployment for AWS doesn't work. @tazarov has a fix for this and a few
other things in https://github.com/chroma-core/chroma/pull/1173 but I'd
like to get the basic fix out before the weekend.

## Test plan
local `terraform init` failed before, now works.

## Documentation Changes
*Are all docstrings for user-facing APIs updated if required? Do we need
to make documentation changes in the [docs
repository](https://github.com/chroma-core/docs)?*
2023-09-22 15:49:29 -07:00
calvintwr
35991bfe87 export IncludeEnum as it is required by #get and #query (#1167)
## Description of changes

The `IncludeEnum` enum is not exported, which causes lint errors when
using `.get` or `.query`, as follows:

```js
const result = await collection.query({
    queryTexts: [query],
    // THIS LINE WILL PRODUCE LINT ERROR as it needs IncludeEnum.Distances etc.
    include: ['distances', 'documents', 'metadatas'],
    nResults: 2,
})
```

## Test plan

Nil

## Documentation Changes

Nil
2023-09-21 10:30:58 -07:00
Trayan Azarov
5436bd5de1 [ENH]: Support for $in and $nin metadata filters (#1151)
Refs: #1105

## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - JS Client support for $in and $nin
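
For readers unfamiliar with the two operators, the sketch below illustrates their intended semantics in plain Python. It is not the JS client code from this PR: `matches` is a hypothetical helper, the `{"field": {"$in": [...]}}` filter shape is assumed, and treating a missing field as passing `$nin` is a choice made only for this example.

```python
from typing import Any, Dict, List


def matches(metadata: Dict[str, Any], where: Dict[str, Dict[str, List[Any]]]) -> bool:
    """Return True if the metadata satisfies every $in / $nin condition."""
    for field, condition in where.items():
        for op, allowed in condition.items():
            present = metadata.get(field) in allowed
            if op == "$in":
                ok = present          # value must be one of the listed values
            elif op == "$nin":
                ok = not present      # value must be none of the listed values
            else:
                raise ValueError(f"unsupported operator: {op}")
            if not ok:
                return False
    return True


docs = [
    {"id": "doc1", "source": "notion"},
    {"id": "doc2", "source": "google-docs"},
    {"id": "doc3", "source": "slack"},
]

in_hits = [d["id"] for d in docs if matches(d, {"source": {"$in": ["notion", "slack"]}})]
nin_hits = [d["id"] for d in docs if matches(d, {"source": {"$nin": ["notion", "slack"]}})]
```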

## Test plan
*How are these changes tested?*

- [x] Tests pass locally `yarn test` for js

## Documentation Changes
TBD
2023-09-20 10:28:38 -07:00
Hammad Bashir
896822231e [ENH] Pulsar Producer & Consumer (#921)
## Description of changes

*Summarize the changes made by this PR.*
 - New functionality
	 - Adds a basic Pulsar producer and consumer with associated tests, as well as a docker compose file for the distributed version of Chroma.

## Test plan
We added bin/cluster-test.sh, which starts pulsar and allows
test_producer_consumer to run the pulsar fixture.

## Documentation Changes
None required.
2023-09-20 02:03:07 -07:00
Hammad Bashir
020950470c [RELEASE] 0.4.12 to fix Dockerfile log issue (#1165)
Releases #1159 as a hotfix. It addresses #1164, which breaks the
docker image.
2023-09-20 00:50:31 -07:00
Trayan Azarov
dffc8067db [BUG]: Docker entrypoint logging path (#1159)
## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - The initial CLI PR (https://github.com/chroma-core/chroma/pull/1032) moved the logging config inside chromadb. If the image is built with the current setup, it fails with `Error: Invalid value for '--log-config': Path 'log_config.yml' does not exist.`

## Test plan
*How are these changes tested?*

Steps to reproduce (prior to this PR):

- `docker build -t chroma:canary .`
- `docker run --rm -it chroma:canary`


## Documentation Changes
*Are all docstrings for user-facing APIs updated if required? Do we need
to make documentation changes in the [docs
repository](https://github.com/chroma-core/docs)?*
2023-09-20 00:47:16 -07:00
Hammad Bashir
7d2dd011cf Release 0.4.11 (#1162)
Release 0.4.11
2023-09-19 09:44:15 -07:00
Hammad Bashir
f3284f62b9 [RELEASE] JS 1.5.11 (#1161)
Releases Js 1.5.11
2023-09-19 09:34:03 -07:00
Trayan Azarov
aa0387a6d7 [BUG]: Added cohere version 6.x support in peer dependencies (#1156)
Refs: #1104

## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - Expands the supported Cohere version range to include 6.x. This plays nicely with the rest of the ecosystem.

## Test plan
*How are these changes tested?*

- [x] `yarn test` for js

## Documentation Changes
N/A
2023-09-19 09:14:16 -07:00
Jeff Huber
b930a862ba js 1.5.10 (#1155)
Release https://github.com/chroma-core/chroma/pull/1153
2023-09-18 21:18:25 -07:00
Trayan Azarov
3aed7b78b2 [BUG]: Fixed BF index overflow issue with subsequent delete (#1150)
Refs: #989

## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - When the brute-force (BF) index overflows `batch_size` on insertion of a large batch, it is cleared. If a subsequent delete request targets IDs that were in the cleared BF index, a warning is raised for a non-existent embedding. The fix checks separately whether the record exists in the BF index and executes the BF removal only if it does.
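
The conditional removal can be pictured with the toy model below. It is a sketch under loud assumptions: `TwoTierIndex`, its `bf` and `main` dictionaries, and the flush-on-overflow rule are hypothetical stand-ins, not Chroma's actual index classes.

```python
from typing import Dict, List


class TwoTierIndex:
    """Hypothetical model: a small brute-force buffer in front of a main index."""

    def __init__(self, batch_size: int) -> None:
        self.batch_size = batch_size
        self.bf: Dict[str, List[float]] = {}
        self.main: Dict[str, List[float]] = {}

    def add(self, ids: List[str], vectors: List[List[float]]) -> None:
        for id_, vec in zip(ids, vectors):
            self.bf[id_] = vec
        if len(self.bf) > self.batch_size:
            # Overflow: move everything to the main index and clear the buffer.
            self.main.update(self.bf)
            self.bf.clear()

    def delete(self, ids: List[str]) -> List[str]:
        """Delete ids from whichever tier holds them; return ids found nowhere."""
        missing = []
        for id_ in ids:
            in_bf = id_ in self.bf
            in_main = id_ in self.main
            if in_bf:            # only touch the buffer if the record is there
                del self.bf[id_]
            if in_main:
                del self.main[id_]
            if not (in_bf or in_main):
                missing.append(id_)
        return missing


index = TwoTierIndex(batch_size=2)
index.add(["a", "b", "c"], [[0.0], [1.0], [2.0]])  # overflows, buffer cleared
index.add(["d"], [[3.0]])                           # stays in the buffer
missing = index.delete(["a", "d", "zzz"])
```

Deleting `"a"` no longer trips over the cleared buffer, and only `"zzz"`, which exists in neither tier, is reported as missing.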

## Test plan
*How are these changes tested?*

- [x] Tests pass locally with `pytest` for python

## Documentation Changes
N/A
2023-09-18 17:43:13 -07:00
Trayan Azarov
9e05cb372a [BUG]: Fixing broken peer deps (#1153)
Refs: #1104

## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - Removed transformers and web-ai peer dependencies.

## Test plan
*How are these changes tested?*

- [ ] Manual testing - `mkdir testproject && cd testproject && npm init
-y && npm link chromadb && npm add langchain`


## Documentation Changes
N/A
2023-09-18 15:56:53 -07:00
Trayan Azarov
82b9c830f7 [ENH]: CIP-5: Large Batch Handling Improvements Proposal (#1077)
- Including only CIP for review.

Refs: #1049

## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - New proposal to handle large batches of embeddings gracefully

## Test plan
*How are these changes tested?*

- [ ] Tests pass locally with `pytest` for python, `yarn test` for js

## Documentation Changes
TBD

---------

Signed-off-by: sunilkumardash9 <sunilkumardash9@gmail.com>
Co-authored-by: Sunil Kumar Dash <47926185+sunilkumardash9@users.noreply.github.com>
2023-09-18 13:00:57 -07:00
Trayan Azarov
2b434b8266 [ENH]: JS Client Static Token support (#1114)
Refs: #1083

## Description of changes

*Summarize the changes made by this PR.*
 - New functionality
	 - JS client now supports Authorization and X-Chroma-Token auth
	 - Tests and integration tests updated

## Test plan
*How are these changes tested?*

- [x] Tests pass locally `yarn test` for js

## Documentation Changes
TBD
2023-09-18 11:30:25 -07:00
Leonid Ganeline
dac67e7ca8 simplified ut-s (#1071)
## Description of changes

 - Improvements 
	 - Simplified unit tests
	 - Cleaned up a typing import

## Test plan

- [+] Tests passed successfully locally with `pytest` for python, `yarn
test` for js

## Documentation Changes
 N/A
2023-09-18 10:45:34 -07:00
Jeff Huber
a0a3c35217 bump JS to 1.5.9 (#1145)
Bump to 1.5.9 to release
https://github.com/chroma-core/chroma/pull/1142
2023-09-15 13:32:41 -07:00
Jacob Lee
d090ca6f6f Fix broken peer OpenAI dep dependency range (#1142)
https://semver.npmjs.com/

<img width="1362" alt="Screenshot 2023-09-13 at 3 51 28 PM"
src="https://github.com/chroma-core/chroma/assets/6952323/ab7752f2-9a82-4ea6-8a9b-f92b13b124b0">

<img width="1414" alt="Screenshot 2023-09-13 at 3 51 16 PM"
src="https://github.com/chroma-core/chroma/assets/6952323/c7761348-e499-4541-a775-b5436abff749">

npm strictly checks peer dep ranges, which means the `npm install` of
anything with a peer dep on Chroma was affected by this.
2023-09-15 10:13:30 -07:00
Sunil Kumar Dash
6681df91bf Enable manual workflow trigger (#1036)
## Description of changes

*Summarize the changes made by this PR.*
 - Added a workflow_dispatch to manually trigger test workflows
 - will be good for development experience

---------

Signed-off-by: sunilkumardash9 <sunilkumardash9@gmail.com>
2023-09-14 05:13:18 +00:00
Trayan Azarov
831c027f5c [SEC]: Bandit Scan (#1113)
## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - Added bandit scanning for all pushes to repo

## Test plan
*How are these changes tested?*

Manual testing of the workflow

## Documentation Changes
N/A - unless we want to start a separate security section in the main
docs repo.

---------

Co-authored-by: Hammad Bashir <HammadB@users.noreply.github.com>
2023-09-11 20:49:55 -07:00
Jeff Huber
7d412aef8c [ENH] initial CLI (#1032)
This proposes an initial CLI

The CLI is installed when you run `pip install chromadb`.

You then call the CLI with

`chroma run --path <persist_dir> --port <port>` where path and port are
optional.

This also adds `chroma help` and `chroma docs` as convenience links -
but I'm open to removing those.

To make this easy - I added `typer` (by the author of FastAPI). I'm not
sure this is the tool that we want to commit to for a fuller featured
CLI, but given the extremely minimal footprint of this - I don't think
it's a one way door.

<img width="1477" alt="Screenshot 2023-08-23 at 4 59 54 PM"
src="https://github.com/chroma-core/chroma/assets/891664/30374228-d303-41e1-8e9e-188b7f8532d4">

***

#### TODO
- [x] test in fresh env - i think i need to add `typer` as a req
- [ ] consider expanding the test to make sure the service is actually
running
- [x] hide the test option from the typer UI
- [x] linking to a getting started guide could be interesting at the top
of the logs
2023-09-11 20:49:25 -07:00
Hammad Bashir
8e967304d6 [RELEASE] 0.4.10 (#1132)
Release 0.4.10
2023-09-11 11:09:02 -07:00
Hammad Bashir
9db68045e6 [CHORE] Bump HNSWlib to latest version that has precompiled binaries (#1109)
## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
- Bump HNSWlib to latest version that has precompiled binaries. Use
alpha release for CI tests before releasing

## Test plan
Existing tests should cover functionality. Build compatibility of the
binaries was manually verified.
- [x] Tests pass locally with `pytest` for python, `yarn test` for js

## Documentation Changes
We should add how to force recompiling with AVX to the docs.
2023-09-11 16:49:36 +00:00
Trayan Azarov
2dd5a15526 [ENH] Added auth and external volume support for GCP (#1107)
## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - Added external volume for Chroma data
	 - Bumped to the latest version (0.4.9)
	 - Added auth by default
	 - Made the template more configurable via variables with sensible defaults

## Test plan
*How are these changes tested?*

- Tested with terraform

## Documentation Changes
The update contains README with docs.
2023-09-11 08:19:57 -07:00
Trayan Azarov
9c1979c931 [BUG]: URL Parsing And Validation (#1118)
## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - Added additional validations to URLs. URLs like api-gw.aws.com/dev will now trigger an error asking the user to correctly specify the URL with http or https
	 - When the full URL (http(s)://example.com) is provided by the user, the port parameter is ignored (a debug message is logged). An assumption is made that the URL is entirely defined, thus not requiring additional alterations such as injecting the port.
	 - Added negative test cases for invalid URLs
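
The rules above might be sketched like this. `resolve_url`, its parameters, and the exact error text are hypothetical and chosen for illustration; the client's real validation may differ in detail.

```python
import logging

logger = logging.getLogger(__name__)


def resolve_url(host: str, port: int = 8000, default_scheme: str = "http") -> str:
    """Build a base URL from host and port, following the rules listed above."""
    if host.startswith(("http://", "https://")):
        # A full URL is assumed to be complete, so the port is not injected.
        logger.debug("Full URL provided; ignoring port %s", port)
        return host
    if "/" in host:
        # Something like "api-gw.aws.com/dev": has a path but no scheme.
        raise ValueError(
            f"Invalid URL {host!r}: specify the full URL with http:// or https://"
        )
    return f"{default_scheme}://{host}:{port}"
```

For example, `resolve_url("localhost", 8000)` yields `http://localhost:8000`, while `resolve_url("https://example.com", 9999)` returns the URL unchanged.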

## Test plan
*How are these changes tested?*

- [x] Tests pass locally with `pytest` for python

## Documentation Changes
TBD
2023-09-11 07:58:20 -07:00
Trayan Azarov
ea73f05bdf [BUG]: Issue where In/Nin list values (#1111)
## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - Fixed an issue where list values for In/Nin that are not wrapped with pypika ParameterValue result in floating-point comparison failures after a certain precision threshold.

## Test plan
*How are these changes tested?*

- [x] Tests pass locally with `pytest` for python

## Documentation Changes
N/A
2023-09-07 13:24:41 -07:00
Hammad Bashir
237b3e3c96 [BLD] Add dockerhub support (#1112)
## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - Pushes images to dockerhub

## Test plan
*How are these changes tested?*
Will have to be tested on main as part of CI/CD
- [x] Tests pass locally with `pytest` for python, `yarn test` for js

## Documentation Changes
None required.
2023-09-06 23:13:29 -07:00
Jeff Huber
3241de7a6f update JS instructions (#960)
Improve develop instructions for the JS client

---------

Co-authored-by: Pascal M <11357019+perzeuss@users.noreply.github.com>
Co-authored-by: Hammad Bashir <HammadB@users.noreply.github.com>
2023-09-06 22:19:41 -07:00
hammadb
747f7c6457 [BLD] Update release workflow tag check 2023-09-05 22:26:52 -07:00
Hammad Bashir
327accb2de [RELEASE] Release 0.4.9 (#1100)
## Description of changes
Release 0.4.9
2023-09-05 22:19:02 -07:00
Jeff Huber
e862ab207f remove unused bin/build (#1099)
closes https://github.com/chroma-core/chroma/issues/1003
2023-09-06 04:57:19 +00:00
hammadb
3e83d174eb [BLD] Make python release processes add a check tag step 2023-09-05 19:04:06 -07:00
hammadb
242d165554 [RELEASE] Release js 1.5.8 2023-09-05 17:21:09 -07:00
Hammad Bashir
c3075ef31a [RELEASE] Release JS 1.5.7 (#1090)
Release JS 1.5.7
2023-09-05 16:50:16 -07:00
hammadb
2acc057d85 [BLD] Switch to version prefixed tags to be able to use github filter patterns on actions. v0.4.9 instead of 0.4.9 2023-09-05 16:49:47 -07:00
hammadb
3729885376 [BLD] JS release token fix. 2023-09-05 16:43:38 -07:00
hammadb
27ece76d36 [BLD] JS release token fix. Python tag push fix 2023-09-05 16:41:13 -07:00
hammadb
b0e75e677d [BLD] js_release workflow fix chroma port 2023-09-05 16:29:56 -07:00
hammadb
096e52280e [BLD] js_release workflow fix chroma port 2023-09-05 16:23:40 -07:00
hammadb
07aa62e098 [BLD] fix run typo 2023-09-05 16:11:33 -07:00
Hammad Bashir
347b8be4a9 [BLD] Add npm run db:run to run docker-compose up (#1097)
## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - Add npm run db:run to run docker-compose up
 - New functionality
	 - ...

## Test plan
Will be tested manually. This is an iteration on #1095 .
- [x] Tests pass locally with `pytest` for python, `yarn test` for js

## Documentation Changes
None
2023-09-05 16:02:50 -07:00
Hammad Bashir
687375ae86 [BLD] Add install deps to JS release workflow (#1095)
Add install deps to JS release workflow
2023-09-05 15:51:19 -07:00
Hammad Bashir
d7e3d82d6f [BLD] Release env vars save to GITHUB_ENV (#1094)
## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - Properly use github_env to pass env vars across steps 
-
https://docs.github.com/en/github-ae@latest/actions/using-workflows/workflow-commands-for-github-actions#setting-an-environment-variable
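
As a rough illustration of the mechanism linked above: a step makes a variable visible to later steps by appending a `NAME=value` line to the file whose path is in `GITHUB_ENV`. The helper below is a hypothetical Python equivalent of that append (the workflow itself may well do this in shell), and it handles single-line values only.

```python
import os
from typing import Optional


def export_env(name: str, value: str, env_file: Optional[str] = None) -> None:
    """Append NAME=value to the file GitHub Actions reads between steps."""
    # GITHUB_ENV is set by the runner; env_file allows local testing.
    path = env_file or os.environ["GITHUB_ENV"]
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(f"{name}={value}\n")
```

A plain `export NAME=value` inside one step does not survive into the next step, which is why the file is needed.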


## Test plan
Will be manually tested, this is an iteration on #1093 
- [ ] Tests pass locally with `pytest` for python, `yarn test` for js

## Documentation Changes
None
2023-09-05 15:42:46 -07:00
Hammad Bashir
8b41722b34 [BLD] Prevent python release from running on js tag pushes (#1093)
## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - Prevent python release from running on js tag pushes

## Test plan
Manually on main via triggering

## Documentation Changes
None
2023-09-05 15:16:25 -07:00
Hammad Bashir
50f31656c0 [BLD] Add working directory to npm run js (#1092)
## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - Add working directory to npm run js
 - New functionality
	 - ...

## Test plan
These must be tested after merge, this is an iteration on #1091 

- [x] Tests pass locally with `pytest` for python, `yarn test` for js

## Documentation Changes
None required.
2023-09-05 15:09:30 -07:00
hammadb
2c8a11f7b2 [BLD] Add working directory to npm run js 2023-09-05 15:07:45 -07:00
Hammad Bashir
986fc38c98 [BLD] Add release automation for JS client (#1091)
## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - Adds release automation for JS client. 

## Test plan
They will have to be tested after merging this PR

## Documentation Changes
Update the js client Develop guide for how to use this automation.
2023-09-05 15:04:31 -07:00
Hammad Bashir
6c700ebdfc Add explicit env to docker compose so that it can hint at usage (#1088)
## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
- Add explicit env for persist directory defaults to docker compose so
that it can hint at usage
 - New functionality
	 - ...

## Test plan
Existing tests
- [x] Tests pass locally with `pytest` for python, `yarn test` for js

## Documentation Changes
None required
2023-09-05 12:48:57 -07:00