## Description of changes
`lifecycle` blocks don't allow variables. Right now our example
deployment for AWS doesn't work. @tazarov has a fix for this and a few
other things in https://github.com/chroma-core/chroma/pull/1173 but I'd
like to get the basic fix out before the weekend.
## Test plan
local `terraform init` failed before, now works.
## Documentation Changes
*Are all docstrings for user-facing APIs updated if required? Do we need
to make documentation changes in the [docs
repository](https://github.com/chroma-core/docs)?*
This proposes an initial CLI.
The CLI is installed when you run `pip install chromadb`.
You then call the CLI with
`chroma run --path <persist_dir> --port <port>` where path and port are
optional.
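The command shape described above can be sketched with the standard library. This is only an illustration of a `run` subcommand with optional `--path` and `--port`; it uses `argparse` as a stand-in rather than `typer`, and the default values (`./chroma_data`, `8000`) are assumptions, not the CLI's actual defaults.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser mirroring `chroma run --path <dir> --port <port>`."""
    parser = argparse.ArgumentParser(prog="chroma")
    subcommands = parser.add_subparsers(dest="command", required=True)

    run = subcommands.add_parser("run", help="Start a Chroma server")
    # Both options are optional; the defaults below are assumed for illustration.
    run.add_argument("--path", default="./chroma_data", help="Persistence directory")
    run.add_argument("--port", type=int, default=8000, help="Port to listen on")
    return parser


# Only --port is given here, so --path falls back to its (assumed) default.
args = build_parser().parse_args(["run", "--port", "9000"])
print(args.command, args.path, args.port)
```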
This also adds `chroma help` and `chroma docs` as convenience links -
but I'm open to removing those.
To make this easy - I added `typer` (by the author of FastAPI). I'm not
sure this is the tool that we want to commit to for a fuller featured
CLI, but given the extremely minimal footprint of this - I don't think
it's a one way door.
<img width="1477" alt="Screenshot 2023-08-23 at 4 59 54 PM"
src="https://github.com/chroma-core/chroma/assets/891664/30374228-d303-41e1-8e9e-188b7f8532d4">
***
#### TODO
- [x] test in fresh env - I think I need to add `typer` as a req
- [ ] consider expanding the test to make sure the service is actually
running
- [x] hide the test option from the typer UI
- [x] linking to a getting started guide could be interesting at the top
of the logs
## Description of changes
*Summarize the changes made by this PR.*
- Improvements & Bug fixes
- Added external volume for Chroma data
- Bumped to the latest version (0.4.9)
- Added auth by default
- Made the template more configurable via variables with sensible
defaults
## Test plan
*How are these changes tested?*
- Tested with terraform
## Documentation Changes
The update includes a README with docs.
Refactored and Rebased version of #1059
## Description of changes
This template includes the following:
- Create a security group with required ports open (22 and 8000)
- Create EC2 instance with Ubuntu 22 and deploy Chroma using docker
compose
- Create a data volume (EBS) for Chroma data
- Mount the data volume to the EC2 instance
- Format the data volume with ext4
- Start Chroma
- Enable (by default) Token Auth with randomly generated token
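For readers who want to supply their own token instead of the generated one, a random token can be produced with the standard library. This is a minimal sketch and not how the template itself generates its token; the function name and the 32-byte default are assumptions.

```python
import secrets


def generate_token(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from `num_bytes` of randomness."""
    if num_bytes <= 0:
        raise ValueError("num_bytes must be positive")
    return secrets.token_urlsafe(num_bytes)


token = generate_token()
print(len(token) > 0)
```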
## Test plan
*How are these changes tested?*
- Terraform tests performed
## Documentation Changes
The template contains a README with a tutorial on how to use it.
Cherry-picked from #1029
## Description of changes
*Summarize the changes made by this PR.*
- Improvements & Bug fixes
- Added support for `$in` and `$nin` metadata filters
> Note: See CIP in `docs/` or example notebook for more info
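The intended semantics of the two operators can be illustrated with a small pure-Python matcher. This is a sketch of the idea only, not Chroma's implementation; in particular, treating a missing key as a non-match for `$in` and as a match for `$nin` is an assumption made here.

```python
from typing import Any, Mapping


def matches(metadata: Mapping[str, Any], where: Mapping[str, Any]) -> bool:
    """Check one metadata record against a flat filter using equality, $in, $nin."""
    for key, condition in where.items():
        if isinstance(condition, Mapping):
            for operator, operand in condition.items():
                if operator == "$in":
                    # Value must be present and be one of the listed values.
                    if key not in metadata or metadata[key] not in operand:
                        return False
                elif operator == "$nin":
                    # Assumption: a missing key counts as "not in the list".
                    if key in metadata and metadata[key] in operand:
                        return False
                else:
                    raise ValueError(f"unsupported operator: {operator}")
        elif key not in metadata or metadata[key] != condition:
            return False
    return True


record = {"category": "news", "year": 2023}
print(matches(record, {"category": {"$in": ["news", "blog"]}}))
print(matches(record, {"year": {"$nin": [2021, 2022]}}))
```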
## Test plan
*How are these changes tested?*
- [x] Tests pass locally with `pytest` for python
## Documentation Changes
TBD
---------
Co-authored-by: Hammad Bashir <HammadB@users.noreply.github.com>
Refs: #1027
## Description of changes
*Summarize the changes made by this PR.*
- New functionality
- Baseline functionality and tests implemented
- Example notebook updated
- Minor refactor of the client creds provider to allow for user-specific
credential fetching.
## Test plan
*How are these changes tested?*
- [x] Tests pass locally with `pytest` for python (regression)
- [x] New fixtures added for token-based auth
## Documentation Changes
Docs should be updated to highlight the new supported auth method.
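As a starting point for those docs, the client side of token auth amounts to attaching the token to each request. The sketch below builds a standard `Authorization: Bearer` header; the helper name is hypothetical, and the exact header the server expects should be checked against the auth CIP.

```python
from typing import Dict


def bearer_headers(token: str) -> Dict[str, str]:
    """Build request headers carrying a static token as a Bearer credential."""
    if not token or any(ch.isspace() for ch in token):
        raise ValueError("token must be non-empty and contain no whitespace")
    return {"Authorization": f"Bearer {token}"}


print(bearer_headers("test-token"))
```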
## Description of changes
*Summarize the changes made by this PR.*
- New functionality
- Auth Provider Client- and Server-Side Abstractions
- Basic Auth Provider
## Test plan
Unit tests for authorized endpoints
## Documentation Changes
Docs should change to describe how to use auth providers on the client
and server. CIP added in `docs/`
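For reference while writing those docs, HTTP Basic auth encodes `username:password` in Base64 and sends it in the `Authorization` header. The sketch below shows that encoding and the matching server-side check using only the standard library; the function names are hypothetical and this is not the provider code added in this PR.

```python
import base64
import hmac


def basic_auth_header(username: str, password: str) -> str:
    """Encode credentials as an HTTP Basic `Authorization` header value."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def check_basic_auth(header_value: str, username: str, password: str) -> bool:
    """Compare a received header value against the expected credentials."""
    expected = basic_auth_header(username, password)
    # Constant-time comparison avoids leaking how much of the value matched.
    return hmac.compare_digest(header_value.encode("utf-8"), expected.encode("utf-8"))


header = basic_auth_header("admin", "admin")
print(check_basic_auth(header, "admin", "admin"))
```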
## Description of changes
*Summarize the changes made by this PR.*
- Improvements & Bug fixes
- Added enhanced examples of how to use `where` filtering with logical
operators based on community questions
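How `$and` and `$or` nest can be illustrated without a running Chroma instance. The matcher below is a simplified sketch supporting only equality plus the two logical operators; it is not Chroma's filter engine, and the metadata keys are made up for the example.

```python
from typing import Any, Mapping


def matches_where(metadata: Mapping[str, Any], where: Mapping[str, Any]) -> bool:
    """Evaluate a nested filter made of $and, $or and plain equality clauses."""
    if "$and" in where:
        return all(matches_where(metadata, clause) for clause in where["$and"])
    if "$or" in where:
        return any(matches_where(metadata, clause) for clause in where["$or"])
    # Leaf clause: every listed key must be present with an equal value.
    return all(key in metadata and metadata[key] == value for key, value in where.items())


# category == "chroma" AND (author == "john" OR author == "jack")
where = {
    "$and": [
        {"category": "chroma"},
        {"$or": [{"author": "john"}, {"author": "jack"}]},
    ]
}
print(matches_where({"category": "chroma", "author": "jack"}, where))
print(matches_where({"category": "chroma", "author": "jill"}, where))
```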
## Test plan
Run the jupyter notebook
`examples/basic_functionality/where_filtering.ipynb`
## Documentation Changes
No document updates.
- The project will now clone the chroma repo and deploy docker-compose
from it
- Introduced a new TF var `chroma_release` to deploy any (well, almost
any) Chroma version; defaults to 0.4.5
- Added output instance_public_ip var to print out the IP address of the
VM
Refs: #950
## Description of changes
*Summarize the changes made by this PR.*
- Improvements & Bug fixes
- Made it possible to deploy any version of chroma through TF var
chroma_release (defaults to 0.4.5)
- Made it so that the VM would clone the repo and checkout the release
tag then run docker compose from it
- Added output of the public IP of the VM after TF deployment
- Slightly improved the README.md
## Test plan
*How are these changes tested?*
```bash
gcloud auth application-default login
cd examples/deployments/google-cloud-compute
terraform init
export TF_VAR_project_id=<your_project_id> #take note of this as it must be present in all of the subsequent steps
export TF_VAR_chroma_release=0.4.5 #set the chroma release to deploy
terraform apply -auto-approve
terraform output instance_public_ip
export instance_public_ip=$(terraform output -raw instance_public_ip)
curl -v http://$instance_public_ip:8000/api/v1
terraform destroy -auto-approve
```
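When scripting the check above from Python, note that `terraform output` may print string values wrapped in double quotes, depending on the Terraform version and flags. The helper below is a small sketch that tolerates both forms before building the URL; the function name is hypothetical, while the port and path are taken from the `curl` call above.

```python
def api_url(terraform_output: str, port: int = 8000) -> str:
    """Turn the text printed by `terraform output instance_public_ip` into a URL."""
    # Drop the trailing newline and any surrounding double quotes.
    ip = terraform_output.strip().strip('"')
    if not ip:
        raise ValueError("empty IP address in terraform output")
    return f"http://{ip}:{port}/api/v1"


print(api_url('"203.0.113.10"\n'))
```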
## Documentation Changes
*Are all docstrings for user-facing APIs updated if required? Do we need
to make documentation changes in the [docs
repository](https://github.com/chroma-core/docs)?*
## Description of changes
This PR creates a new starter notebook intended to familiarize people
with the very basic, core functionality of embedding retrieval with
Chroma. It's self-contained, and hopefully straightforward and easy to
understand.
There is also a minor fix to the experimental notebook.
## Test plan
Ran the notebook, also via Colab.
## Documentation Changes
None.
## TODO
- [x] https://github.com/chroma-core/chroma/issues/880 Canonical 'chat
with your documents'
## Description of changes
Base PR to release sqlite refactor, which spans many stacked PRs.
Remaining
- [x] Merge this to main
- [x] Layered Persistent Index #761
- [x] Remove old impls (In #781 )
- [x] Remove persist() API (In #787)
- [x] Add telemetry to SegmentAPI, it was not included. (#788)
- [x] New clients #805
- [x] locking and soak tests for thread-safety
- [x] Migration tool
- [x] Fix #739
- [x] Fix metadata None vs empty
- [x] Fix persist directory (addressed in #761)
- [x] Leave files open in #761 (merge stacked PR)
Post Release
- [ ] Un xfail cross version tests once we cut the release
- [x] Documentation updates for new silent ADD failure.
- [x] Update all documentation for new API instantiation
- [x] Update all documentation for settings changes
- [ ] Update terraform deployment
- [ ] Update cloudformation deployment
---------
Co-authored-by: Luke VanderHart <luke@vanderhart.net>
Co-authored-by: Jeffrey Huber <jeff@trychroma.com>
Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sebastian Sosa <37946988+CakeCrusher@users.noreply.github.com>
Co-authored-by: Russell Pollari <russell@sharpestminds.com>
Co-authored-by: russell-pollari <pollarir@mgail.com>
## Description of changes
*Summarize the changes made by this PR.*
- Improvements & Bug fixes
- Terraform definition was not correct, and Docker containers would not
boot up
- Missing firewall definition to the outside world
## Test plan
*How are these changes tested?*
Run
```bash
terraform apply -var="project_id=<your_project_id>" -auto-approve
```
## Documentation Changes
*Are all docstrings for user-facing APIs updated if required? Do we need
to make documentation changes in the [docs
repository](https://github.com/chroma-core/docs)?*
## Description of changes
- Add an optional `instruction` constructor parameter to
InstructorEmbeddingFunction to allow `instruction` and Document pairs to
be encoded.
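The pairing step can be sketched as follows, assuming the underlying INSTRUCTOR model accepts a list of `[instruction, text]` pairs when an instruction is set and plain strings otherwise. The helper name is hypothetical and this is not the code in the PR.

```python
from typing import List, Optional, Sequence, Union


def pair_with_instruction(
    texts: Sequence[str], instruction: Optional[str] = None
) -> List[Union[str, List[str]]]:
    """Prepare model input: plain texts, or [instruction, text] pairs if set."""
    if instruction is None:
        return list(texts)
    return [[instruction, text] for text in texts]


docs = ["Chroma stores embeddings.", "Terraform provisions infrastructure."]
print(pair_with_instruction(docs, "Represent the document for retrieval: "))
```

Using one instruction for storing and another for querying, as described below, would mean calling the helper with a different `instruction` value for each side.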
## Test plan
## Documentation Changes
Added examples to the Alternative Embedding notebook.
Not sure if this is a good implementation, since you'll need a separate
Collection for each instruction you want to use (or reassign
`self._instruction`), but at least the change is pretty minimal. For my
use case, two instructions are enough (one for storing, one for
retrieving). For a scenario where you need lots of different
instructions, perhaps "Represent the <Science|Financial|Political|etc.>
article: ", another solution is needed.
Feature Request #546
---------
Co-authored-by: Jeffrey Huber <jeff@trychroma.com>
* Cleaner user method signatures
* Set model_space automatically on process if not supplied.
* New MNIST notebook
* Display MNIST digits we select
* Removed cruft, more docu
* Return 1k results by default
* Updated notebook working dir.
* cwd is not a real command