This PR has a few fixes
- fix the JS examples to `1.5.1` and add documentation for CORS
- add a test for `queryTexts` with a stubbed out embedding function
## Description of changes
New API implementation backed by the segment-based architecture. Should
be extensible to a full distributed architecture.
---------
Co-authored-by: Jeffrey Huber <jeff@trychroma.com>
Co-authored-by: hammadb <hammad@trychroma.com>
Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Hammad Bashir <HammadB@users.noreply.github.com>
connect #725
## Description of changes
Add a new `fetchOptions` parameter on the JS ChromaClient creation that
allows you to pass in custom `fetch` options.
Example:
```
const chroma = new ChromaClient({
path: "http://localhost:8000",
fetchOptions: {
headers: {
'X-Api-Token': "sk-live-Hunt3r2", // Works like regular node-fetch headers!
}
}
});
```
## Test plan
The existing test suite was run and validated as passing.
## Documentation Changes
Typescript would allow the developer to know that these options are
available. I do not think it currently requires updates to the Chroma
Documentation - especially since this change is only impacting the JS
client.
---------
Co-authored-by: Jeffrey Huber <jeff@trychroma.com>
## Description of changes
*Summarize the changes made by this PR.*
- Improvements & Bug fixes
- Fixes cohere embeddings not using the api key.
## Test plan
* I tested that cohere embeddings returned a result
* Without this change the result is undefined
## Documentation Changes
*Are all docstrings for user-facing APIs updated if required? Do we need
to make documentation changes in the [docs
repository](https://github.com/chroma-core/docs)?*
no
## Description of changes
*Summarize the changes made by this PR.*
- New functionality
- Add support for
[Transformers.js](https://github.com/xenova/transformers.js). This
allows users to run Hugging Face transformers directly from their JS
applications. For more information, see
https://huggingface.co/docs/transformers.js.
## Test plan
*How are these changes tested?*
I followed the [JS "Getting started"
guide](https://docs.trychroma.com/getting-started?lang=js) to ensure it
works as intended. Here's the script, and its output:
```js
const {ChromaClient, TransformersEmbeddingFunction} = require('chromadb');
const client = new ChromaClient();
// Create the embedder. In this case, I just use the defaults, but you can change the model,
// quantization, revision, or add a progress callback, if desired.
const embedder = new TransformersEmbeddingFunction({ /* Configuration goes here */ });
const main = async () => {
// Empties and completely resets the database.
await client.reset()
// Create the collection
const collection = await client.createCollection({name: "my_collection", embeddingFunction: embedder})
// Add some data to the collection
await collection.add({
ids: ["id1", "id2", "id3"],
metadatas: [{"source": "my_source"}, {"source": "my_source"}, {"source": "my_source"}],
documents: ["I love walking my dog", "This is another document", "This is a legal document"],
})
// Query the collection
const results = await collection.query({
nResults: 2,
queryTexts: ["This is a query document"]
})
console.log(results)
// {
// ids: [ [ 'id2', 'id3' ] ],
// embeddings: null,
// documents: [ [ 'This is another document', 'This is a legal document' ] ],
// metadatas: [ [ [Object], [Object] ] ],
// distances: [ [ 1.0109775066375732, 1.0756263732910156 ] ]
// }
}
main();
```
## Documentation Changes
*Are all docstrings for user-facing APIs updated if required? Do we need
to make documentation changes in the [docs
repository](https://github.com/chroma-core/docs)?*
I just added some type annotations for the constructor.
## Description of changes
It turns out `isinstance(bool, int)` returns True. We need to test for
`bool` separately.
Addresses the first part of
https://github.com/chroma-core/chroma/issues/671
## Test plan
A new test, `test_boolean_metadata`, marked xfail.
## Documentation Changes
None. This now matches the documentation.
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
## Description of changes
- Improvements & Bug fixes
- Fix incorrect types for `embeddings` and `distances` in
`QueryResponse`
- closes: #660
## Test plan
Tests ran and passed
## Documentation Changes
None
Co-authored-by: russell-pollari <pollarir@mgail.com>
## Description of changes
Added a new JavaScript embedding function to generate embeddings for
text and images using the [Web AI](https://github.com/visheratin/web-ai)
library. The users will now be able to generate the embeddings in both
NodeJS and (IMO more importantly) in the browser. This, along with the
existing JS library design, enables the decoupling of embeddings
generation from storage (see example demo below).
## Test plan
I created a demo to check how the function works in web apps -
https://web-ai-chroma.vercel.app/
The embeddings are generated on the client side and sent to the server
to add to the collection or query for similar items.
## Documentation Changes
Likely this change will require changes in the documentation.
## Description of changes
When fetching a collection that does not exist, the JS client would fail
silently, returning a collection with `undefined` for all its
attributes.
- Improvements & Bug fixes
- `ChromaClient.getCollection` will throw an error if there is an error
in the API response
- closes: #634
## Test plan
Passed tests with `$ yarn test`
## Documentation Changes
None
---------
Co-authored-by: russell-pollari <pollarir@mgail.com>
## Description of changes
*Summarize the changes made by this PR.*
- Improvements & Bug fixes
- Typing cleanup
- Changes embedding function defaults to work through the default params
as opposed to via None so that None now means - NO embedding function as
opposed to use the default. This is technically a breaking change.
- New functionality
- Adds a thin client that restricts what you can create to the REST api
client and builds it separately so it can be published to its own pypi
package. The thin client restricts the default embedding function to be
None always - forcing manual specification of the embedding function
while using the thin client.
- The thin client is built with its own pyproject.toml with a limited
set of dependencies and a is_thin_client.py file that acts as a compile
flag. The build script stages the toml, places the two files in the
right place, performs the build and then tears down the changes.
Addresses #289
## Test plan
The existing tests should cover this configuration. We can add CI for
the thin-client in the future.
## Documentation Changes
We will add a section on the docs that explains the thin client and its
limitations.
## Description of changes
Please let me know if anything needs to be changed or done differently!
We can certainly continue to make efforts to make our JS client more
structured and maintainable.
*Summary:*
- Improvements & Bug fixes
- Broke `client/js/src/index.ts` down and put them in separate files
- Created `IEmbeddedFunction` interface to improve the maintainability
and extensibility of the client code, making it easier to add new
embedding function types in the future
- Replaced `CallableFunction` with `IEmbeddingFunction`
- Closes#536
- New functionality
- None
## Test plan
Refactored tests
**Note:** the Collection tests have been failing on main when I run
using `yarn test`, did we know about this?
## Documentation Changes
Not sure, couldn't find anything?
In this PR
- docstrings!
- types!
- named params!
This is a breaking change to the JS Client - it switches from positional
to named arguments originally brought up by @transitive-bullshit in
https://github.com/chroma-core/chroma/issues/254
Example:
```
// before
const collection = await chroma.createCollection("test");
//after
const collection = await chroma.createCollection({ name: "test" });
```
This is especially nice when you are only using a few params.
This PR does not bump the `npm` version number - we will cut a new
release on Monday.
We will need to land an update to our docs at that time as well. TBD
Adds precommit hooks based on #433 to our repository. Only one file here is new - the configuration for the hooks, everything else is linting/formatting fixes. We do not run the typechecker globally since that would be quite lengthy to clean up - instead we will have to incrementally clean up type check issues as we go.
Class OpenAIEmbeddingFunction() is initiated with an optional parameter - model (string). But inside the generate() function of the class, the optional parameter is forgotten to be used when making API Call to OpenAI. Instead of the parameter, a string literal "text-embedding-ada-002" is used.