Alusus language bindings for the FAISS library - A library for efficient similarity search and clustering of dense vectors.
This library provides Alusus bindings to FAISS, enabling high-performance vector similarity search and clustering operations in the Alusus programming language.
import "Apm";
Apm.importFile("Alusus/Faiss");
use Faiss;
import "Srl/Console";
import "Srl/Array";
import "Apm";
Apm.importFile("Alusus/Faiss");
use Srl;
use Faiss;
// Create a flat index with 4-dimensional vectors
def index: ref[Index];
Index.new(index, 4, "Flat", MetricType.METRIC_INNER_PRODUCT);
// Add vectors to the index
def xb: Array[Float]({1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 4.0, 5.0});
index.add(2, xb.buf); // 2 vectors
// Search for nearest neighbors
def xq: Array[Float]({1.5, 2.5, 3.5, 4.5});
def labels: array[Int[64], 3];
def distances: array[Float, 3];
index.search(1, xq.buf, 3, distances, labels); // Request 3 nearest neighbors; only 2 vectors are indexed, so the third slot has no match
// Clean up
Index.free(index);
See complete examples in the Examples/ directory.
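To make the quick-start result concrete, here is a minimal pure-Python sketch of what an exact inner-product search computes. It is not the binding's API: `flat_ip_search` is a hypothetical helper, and padding a missing neighbor's distance with negative infinity is an assumption made for this illustration (FAISS itself reports a label of -1 when fewer than k vectors are available).

```python
def flat_ip_search(vectors, query, k):
    """Exact top-k search by inner product over a list of vectors."""
    scored = [(sum(a * b for a, b in zip(v, query)), i) for i, v in enumerate(vectors)]
    # A larger inner product means more similar; break ties by lower id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    top = scored[:k]
    distances = [score for score, _ in top]
    labels = [i for _, i in top]
    # Pad the outputs when the index holds fewer than k vectors.
    while len(labels) < k:
        labels.append(-1)
        distances.append(float("-inf"))  # placeholder chosen for this sketch
    return distances, labels


xb = [[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]]  # the two vectors from the quick start
xq = [1.5, 2.5, 3.5, 4.5]
distances, labels = flat_ip_search(xb, xq, 3)
print(labels)     # [1, 0, -1]
print(distances)  # [47.0, 35.0, -inf]
```

The second stored vector wins because its inner product with the query (47.0) is larger than the first vector's (35.0).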
This library wraps the FAISS C API. For detailed documentation of concepts, algorithms, and best practices, please refer to the official FAISS documentation:
- Main Documentation: https://github.com/facebookresearch/faiss/wiki
- C API Reference: https://github.com/facebookresearch/faiss/blob/main/c_api/
- Getting Started Tutorial: https://github.com/facebookresearch/faiss/wiki/Getting-started
- Index Selection Guide: https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index
Main index class for similarity search. C API docs
Factory method:
- Index.new(obj: ref[ref[Index]], d: Int, description: CharsPtr, metric: Int): Int - Create index using factory string
Key methods:
- train(n: Int[64], x: ref[array[Float]]): Int - Train the index on data
- add(n: Int[64], x: ref[array[Float]]): Int - Add vectors to index
- search(n: Int[64], x: ref[array[Float]], k: Int[64], distances: ref[array[Float]], labels: ref[array[Int[64]]]): Int - Search for k nearest neighbors
- rangeSearch(n: Int[64], x: ref[array[Float]], radius: Float, result: ref[RangeSearchResult]): Int - Range search
- reset(): Int - Remove all vectors from index
- removeIds(sel: ref[IdSelector], nRemoved: ref[ArchWord]): Int - Remove specific vectors
Properties:
- d: Int[64] - Vector dimension
- nTotal: Int[64] - Total number of indexed vectors
- isTrained: Int - Whether index is trained (0 or 1)
- metricType: MetricType - Distance metric being used
- verbose: Int - Verbosity level
Cleanup:
- Index.free(obj: ref[Index]) - Free index memory
Brute-force index performing exact search. Guide
Creation:
- IndexFlat.new(obj: ref[ref[IndexFlat]]): Int
- IndexFlat.new(obj: ref[ref[IndexFlat]], d: Int[64], metric: MetricType): Int
Additional methods:
- getXb(outXb: ref[ref[array[Float]]], outSize: ref[ArchWord]) - Get stored vectors
- computeDistanceSubset(n: Int[64], x: ref[array[Float]], k: Int[64], outDistances: ref[array[Float]], labels: ref[array[Int[64]]]): Int - Compute distances to subset
Inherits all Index methods.
Flat index specialized for inner product metric. Docs
Creation:
- IndexFlatIp.new(obj: ref[ref[IndexFlatIp]]): Int
- IndexFlatIp.new(obj: ref[ref[IndexFlatIp]], d: Int[64]): Int
Flat index specialized for L2 (Euclidean) distance. Docs
Creation:
- IndexFlatL2.new(obj: ref[ref[IndexFlatL2]]): Int
- IndexFlatL2.new(obj: ref[ref[IndexFlatL2]], d: Int[64]): Int
Inverted file index for faster approximate search. Guide
Additional properties:
- nList: ArchWord - Number of inverted lists (clusters)
- nProbe: ArchWord - Number of clusters to visit during search (tunable)
- quantizer: ref[Index] - Quantizer index
- ownFields: Int - Whether index owns its fields
Additional methods:
- mergeFrom(other: ref[IndexIvf], addId: Int[64]): Int - Merge another IVF index
- copySubsetTo(other: ref[IndexIvf], subsetType: Int, a1: Int[64], a2: Int[64]): Int - Copy subset of vectors
- getListSize(listNo: ArchWord): ArchWord - Get size of inverted list
- makeDirectMap(newMaintainDirectMap: Int): Int - Create direct map for reconstruction
- imbalanceFactor: Float[64] - Get cluster imbalance factor
- printStats() - Print index statistics
Index for binary (hamming) vectors. Guide
Similar to Index but operates on binary vectors (Word[8] arrays instead of Float arrays).
Manages index parameters for grid search and tuning. C API
Methods:
- new(parameterSpace: ref[ref[ParameterSpace]]): Int
- setIndexParameter(index: ref[Index], paramName: CharsPtr, val: Float[64]): Int - Set single parameter
- setIndexParameters(index: ref[Index], params: CharsPtr): Int - Set multiple parameters
- addRange(name: CharsPtr, outRange: ref[ref[ParameterRange]]): Int - Add parameter range
Runtime search parameters. C API
Methods:
- new(obj: ref[ref[SearchParameters]], sel: ref[IdSelector]): Int
- nProbe: Int - Number of clusters to probe (for IVF indexes)
Extended search parameters for IVF indexes.
Methods:
- new(obj: ref[ref[SearchParametersIvf]]): Int
- new(obj: ref[ref[SearchParametersIvf]], sel: ref[IdSelector], nprobe: ArchWord, maxCodes: ArchWord): Int
Properties:
- sel: ref[IdSelector] - ID selector
- nProbe: ArchWord - Number of clusters to probe
- maxCodes: ArchWord - Maximum codes to scan
K-means clustering implementation. C API
Creation:
- new(out: ref[ref[Clustering]], d: Int, k: Int): Int - Create with dimension and k clusters
- new(out: ref[ref[Clustering]], d: Int, k: Int, params: ptr[ClusteringParameters]): Int - Create with parameters
Methods:
- train(n: Int[64], x: ref[Float], index: ref[Index]): Int - Run k-means
- getCentroids(centroids: ref[ref[array[Float]]], size: ref[ArchWord]) - Get cluster centroids
- getIterationStats(stats_out: ref[ref[ClusteringIterationStats]], size: ref[ArchWord]) - Get iteration statistics
Properties:
- niter: Int - Number of iterations
- nredo: Int - Number of k-means restarts
- k: ArchWord - Number of clusters
- d: ArchWord - Vector dimension
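As a rough picture of what the niter and k properties control, the following pure-Python sketch runs plain Lloyd-style k-means. It is an illustration only, not the binding's or FAISS's implementation: the `kmeans` helper is hypothetical, and seeding the centroids with the first k points is a simplification (restarts, as controlled by nredo, are not shown).

```python
def kmeans(points, k, niter):
    """Lloyd's algorithm: alternate assignment and centroid update niter times."""
    centroids = [list(p) for p in points[:k]]  # simplification: seed with the first k points
    for _ in range(niter):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to the centroid with the smallest squared L2 distance.
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


points = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
print(kmeans(points, k=2, niter=10))  # [[1.0], [11.0]]
```

On this toy data the centroids settle after two iterations; further iterations leave them unchanged.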
Select subsets of vectors by ID. C API
Variants:
- IdSelectorBatch - Select specific IDs from a list
- IdSelectorRange - Select IDs in a range
- IdSelectorBitmap - Select using a bitmap
- IdSelectorNot - Invert a selector
- IdSelectorAnd - Combine selectors with AND
- IdSelectorOr - Combine selectors with OR
- IdSelectorXor - Combine selectors with XOR
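The way these variants compose can be sketched with plain predicates. This is a conceptual model in Python, not the binding's API: every function name below is hypothetical, the bitmap variant is left out, and the half-open range [imin, imax) is assumed to match the behavior of FAISS's range selector.

```python
def batch(ids):
    """Select IDs that appear in an explicit list."""
    wanted = set(ids)
    return lambda i: i in wanted


def id_range(imin, imax):
    """Select IDs in the half-open interval [imin, imax) (assumed convention)."""
    return lambda i: imin <= i < imax


def not_(sel):
    return lambda i: not sel(i)


def and_(a, b):
    return lambda i: a(i) and b(i)


def or_(a, b):
    return lambda i: a(i) or b(i)


def xor_(a, b):
    return lambda i: a(i) != b(i)


# IDs 0..9, except 3 and 4.
sel = and_(id_range(0, 10), not_(batch([3, 4])))
selected = [i for i in range(12) if sel(i)]
print(selected)  # [0, 1, 2, 5, 6, 7, 8, 9]
```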
Results from range search queries. C API
Methods:
- new(obj: ref[ref[RangeSearchResult]], nq: Int[64]): Int
- doAllocation(): Int - Allocate result buffers
- bufferSize(): ArchWord - Get buffer size
- getLims(outLims: ref[ref[array[ArchWord]]]) - Get result limits array
- getLabels(outLabels: ref[ref[array[Int[64]]]], outDistances: ref[ref[ref[Float]]]) - Get labels and distances
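The limits array is easiest to understand by example. In FAISS, the results for query i occupy positions lims[i] up to (but not including) lims[i+1] of the labels and distances arrays. The sketch below models that layout in pure Python; the `range_search` helper is hypothetical, and the use of squared L2 distance with a strict "less than radius" test is an assumption of this illustration.

```python
def range_search(vectors, queries, radius):
    """Return (lims, labels, distances) for all vectors within radius of each query."""
    lims, labels, distances = [0], [], []
    for q in queries:
        for i, v in enumerate(vectors):
            d = sum((a - b) ** 2 for a, b in zip(v, q))  # squared L2 distance
            if d < radius:
                labels.append(i)
                distances.append(d)
        lims.append(len(labels))  # end offset of this query's results
    return lims, labels, distances


vectors = [[0.0], [1.0], [5.0]]
queries = [[0.5], [9.0], [4.0]]
lims, labels, distances = range_search(vectors, queries, radius=2.0)
print(lims)  # [0, 2, 2, 3]
for qi in range(len(queries)):
    # Slice out the matches that belong to query qi.
    print(qi, labels[lims[qi]:lims[qi + 1]])
```

A query with no matches, such as the second one here, shows up as two equal consecutive entries in lims.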
Compute distances to vectors. C API
Methods:
- setQuery(x: ref[array[Float]]): Int - Set query vector
- vectorToQueryDis(i: Int[64], qd: ref[array[Float]]): Int - Distance to query
- symmetricDis(i: Int[64], j: Int[64], vd: ref[array[Float]]): Int - Symmetric distance
Distance metrics. Docs
- METRIC_INNER_PRODUCT: 0 - Inner product (maximum similarity)
- METRIC_L2: 1 - Euclidean distance (L2 norm)
- METRIC_L1: 2 - Manhattan distance (L1 norm)
- METRIC_LINF: 3 - Infinity norm (Chebyshev distance)
- METRIC_LP: 4 - Lp norm
- METRIC_CANBERRA: 20 - Canberra distance
- METRIC_BRAY_CURTIS: 21 - Bray-Curtis dissimilarity
- METRIC_JENSEN_SHANNON: 22 - Jensen-Shannon divergence
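For reference, the first four metrics can be written out in a few lines of pure Python. The function names are hypothetical and this is only a sketch of the formulas; note that FAISS reports squared Euclidean distances for METRIC_L2, which is why no square root is taken here.

```python
def inner_product(x, y):
    return sum(a * b for a, b in zip(x, y))


def l2_squared(x, y):
    # Squared Euclidean distance (no square root).
    return sum((a - b) ** 2 for a, b in zip(x, y))


def l1(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def linf(x, y):
    return max(abs(a - b) for a, b in zip(x, y))


x = [1.0, 2.0, 3.0]
y = [4.0, 0.0, 5.0]
print(inner_product(x, y))  # 19.0
print(l2_squared(x, y))     # 17.0
print(l1(x, y))             # 7.0
print(linf(x, y))           # 3.0
```

With inner product, larger values mean closer matches; with the distance metrics, smaller values do.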
Return codes from C API functions.
- OK: 0 - Success
- UNKNOWN_EXCEPT: -1 - Unknown exception
- FAISS_EXCEPT: -2 - FAISS exception
- STD_EXCEPT: -4 - Standard library exception
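Since most methods return one of these codes, callers typically check the result after each call. The pattern is sketched below in Python as a language-neutral illustration; `check` and `ERROR_NAMES` are hypothetical names, and the error text passed in stands in for whatever message the library reports.

```python
ERROR_NAMES = {0: "OK", -1: "UNKNOWN_EXCEPT", -2: "FAISS_EXCEPT", -4: "STD_EXCEPT"}


def check(code, last_error=""):
    """Raise if a return code signals failure; pass the code through on success."""
    if code != 0:
        name = ERROR_NAMES.get(code, "UNRECOGNIZED")
        raise RuntimeError(f"{name} ({code}): {last_error}")
    return code


check(0)  # success: nothing happens
try:
    check(-2, "index not trained")  # made-up message for the example
except RuntimeError as err:
    message = str(err)
    print(message)  # FAISS_EXCEPT (-2): index not trained
```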
- getLastError(): CharsPtr - Get last error message
- kmeansClustering(d: ArchWord, n: ArchWord, k: ArchWord, x: ref[array[Float]], centroids: ref[array[Float]], q_error: ref[Float]): Int - Standalone k-means
To enable GPU acceleration, set the environment variable before running:
export FAISS_USE_GPU=1
The library will automatically load GPU-enabled binaries when available. See FAISS GPU documentation for details.
The Index.new factory method accepts strings to create different index types:
- "Flat" - Exact search (brute force)
- "IVFn,Flat" - IVF with n centroids, flat encoding
- "IVFn,PQm" - IVF with n centroids, PQ with m subquantizers
- "HNSW32" - Hierarchical navigable small world with 32 neighbors
- "IVFn,HNSW32" - Combined IVF and HNSW
See the index factory documentation for all available options and combinations.
Complete working examples are in the Examples/ directory:
- example.alusus - Basic flat index with inner product search
- example2.alusus - IVF index with parameter tuning
- Index Selection:
  - Use IndexFlat for exact search on datasets <1M vectors
  - Use IndexIVF for approximate search on larger datasets
  - See the index selection guide
- Training: IVF and other approximate indexes require training before adding vectors
- nprobe Parameter: For IVF indexes, higher nprobe = better accuracy but slower search
- GPU Acceleration: Enable GPU for operations on >10M vectors
- Memory: Flat indexes store all vectors in memory; use compression for large datasets
See FAISS performance guidelines for detailed recommendations.
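The nprobe trade-off can be seen in a toy model of an inverted file index. The pure-Python sketch below is an illustration under simplifying assumptions, not FAISS's implementation: the helper names are hypothetical, the centroids are fixed by hand rather than trained, and only the single nearest neighbor is returned.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_lists(vectors, centroids):
    """Assign each vector id to the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
        lists[nearest].append(i)
    return lists


def ivf_search(vectors, centroids, lists, query, nprobe):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [i for c in order[:nprobe] for i in lists[c]]
    return min(candidates, key=lambda i: sq_dist(query, vectors[i]))


centroids = [[0.0], [1000.0]]  # hand-picked for the example
vectors = [[100.0], [490.0], [900.0], [540.0]]
lists = build_lists(vectors, centroids)  # [[0, 1], [2, 3]]
query = [505.0]
print(ivf_search(vectors, centroids, lists, query, nprobe=1))  # 3
print(ivf_search(vectors, centroids, lists, query, nprobe=2))  # 1
```

With nprobe=1 only the list nearest the query is scanned, so the true nearest neighbor (id 1, stored in the other list) is missed. Scanning both lists finds it, at the cost of comparing against every vector.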
- FAISS GitHub: https://github.com/facebookresearch/faiss
- FAISS Wiki: https://github.com/facebookresearch/faiss/wiki
- Research Paper: Billion-scale similarity search with GPUs
- Alusus Language: https://alusus.org
Copyright (c) Facebook, Inc. and its affiliates.
Copyright (c) Alusus Software Ltd. for the Alusus language bindings.
This binding follows the FAISS license (MIT). See the LICENSE file for details.