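Deploying Milvus with Docker and Reverse-Proxying gRPC with Nginx Proxy Manager
About this article
This article shows how to deploy the Milvus (v2.5.19) vector database with Docker, reverse-proxy its gRPC endpoint with Nginx Proxy Manager, and put Cloudflare's CDN in front of it.
Why go to all this trouble? Mainly because my Oracle Cloud server in Singapore West sits idle most of the time and I want to put it to good use. I also recently started using Claude Code for vibe coding and came across a project called claude-context. I had long been frustrated that AI agents of this kind are poor at gathering context, and this project appears to address that with something like semantic retrieval (though not LSP) combined with vector search. It uses Milvus as its vector database, so I decided to deploy Milvus and try it out.
This article covers only Milvus Standalone mode. A cluster deployment is too complex and I have no need for one yet; I will update the tutorial if that changes.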
Components
- Nginx Proxy Manager: reverse-proxies the Milvus gRPC port.
- Milvus: a production-grade vector database with efficient vector search.
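To make the division of labour between the two components concrete, here is a minimal sketch of how the endpoint a client connects to changes once Milvus sits behind the proxy. The helper `milvus_uri` and the hostname `milvus.example.com` are hypothetical names used only for illustration; 19530 is the proxy port found in the default `milvus.yaml`, and port 443 with TLS is an assumption about how the reverse proxy is exposed.

```python
def milvus_uri(host: str, port: int = 19530, tls: bool = False) -> str:
    """Build the URI string a Milvus client would be pointed at (illustrative helper)."""
    scheme = "https" if tls else "http"
    return f"{scheme}://{host}:{port}"


# Direct access on the Docker host: plain gRPC on the default Milvus proxy port.
direct = milvus_uri("127.0.0.1")

# Access through the reverse proxy: TLS on port 443 (the hostname is a placeholder).
proxied = milvus_uri("milvus.example.com", port=443, tls=True)

print(direct)
print(proxied)
```

With pymilvus installed, either string could then be passed as the `uri` argument of `MilvusClient`; whether a `token` is also needed depends on whether authentication is enabled on the Milvus side.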
Deployment
Deploying Milvus
Basic configuration
- Create the working directory and change into it:

    mkdir -p ~/docker_data/milvus && cd ~/docker_data/milvus

- Download the Milvus milvus.yaml file:

    wget https://raw.githubusercontent.com/milvus-io/milvus/v2.5.19/configs/milvus.yaml -O milvus.yaml

- Edit milvus.yaml to set some basic options (in nano, press Ctrl+W to search):

    nano milvus.yaml

  milvus.yaml

    # Licensed to the LF AI & Data foundation under one
    # or more contributor license agreements. See the NOTICE file
    # distributed with this work for additional information
    # regarding copyright ownership. The ASF licenses this file
    # to you under the Apache License, Version 2.0 (the
    # "License"); you may not use this file except in compliance
    # with the License. You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.

    # Related configuration of etcd, used to store Milvus metadata & service discovery.
    etcd:
      # Endpoints used to access etcd service. You can change this parameter as the endpoints of your own etcd cluster.
      # Environment variable: ETCD_ENDPOINTS
      # etcd preferentially acquires valid address from environment variable ETCD_ENDPOINTS when Milvus is started.
      endpoints: localhost:2379
      # Root prefix of the key to where Milvus stores data in etcd.
      # It is recommended to change this parameter before starting Milvus for the first time.
      # To share an etcd instance among multiple Milvus instances, consider changing this to a different value for each Milvus instance before you start them.
      # Set an easy-to-identify root path for Milvus if etcd service already exists.
      # Changing this for an already running Milvus instance may result in failures to read legacy data.
      rootPath: by-dev
      # Sub-prefix of the key to where Milvus stores metadata-related information in etcd.
      # Caution: Changing this parameter after using Milvus for a period of time will affect your access to old data.
      # It is recommended to change this parameter before starting Milvus for the first time.
      metaSubPath: meta
      # Sub-prefix of the key to where Milvus stores timestamps in etcd.
      # Caution: Changing this parameter after using Milvus for a period of time will affect your access to old data.
      # It is recommended not to change this parameter if there is no specific reason.
      kvSubPath: kv
      log:
        level: info # Only supports debug, info, warn, error, panic, or fatal. Default 'info'.
        # path is one of:
        #  - "default" as os.Stderr,
        #  - "stderr" as os.Stderr,
        #  - "stdout" as os.Stdout,
        #  - file path to append server logs to.
        # please adjust in embedded Milvus: /tmp/milvus/logs/etcd.log
        path: stdout
      ssl:
        enabled: false # Whether to support ETCD secure connection mode
        tlsCert: /path/to/etcd-client.pem # path to your cert file
        tlsKey: /path/to/etcd-client-key.pem # path to your key file
        tlsCACert: /path/to/ca.pem # path to your CACert file
        # TLS min version
        # Optional values: 1.0, 1.1, 1.2, 1.3。
        # We recommend using version 1.2 and above.
        tlsMinVersion: 1.3
      requestTimeout: 10000 # Etcd operation timeout in milliseconds
      use:
        embed: false # Whether to enable embedded Etcd (an in-process EtcdServer).
      data:
        dir: default.etcd # Embedded Etcd only. please adjust in embedded Milvus: /tmp/milvus/etcdData/
      auth:
        enabled: false # Whether to enable authentication
        userName: # username for etcd authentication
        password: # password for etcd authentication

    metastore:
      type: etcd # Default value: etcd, Valid values: [etcd, tikv]
      snapshot:
        ttl: 86400 # snapshot ttl in seconds
        reserveTime: 3600 # snapshot reserve time in seconds

    # Related configuration of tikv, used to store Milvus metadata.
    # Notice that when TiKV is enabled for metastore, you still need to have etcd for service discovery.
    # TiKV is a good option when the metadata size requires better horizontal scalability.
    tikv:
      endpoints: 127.0.0.1:2389 # Note that the default pd port of tikv is 2379, which conflicts with etcd.
      rootPath: by-dev # The root path where data is stored in tikv
      metaSubPath: meta # metaRootPath = rootPath + '/' + metaSubPath
      kvSubPath: kv # kvRootPath = rootPath + '/' + kvSubPath
      requestTimeout: 10000 # ms, tikv request timeout
      snapshotScanSize: 256 # batch size of tikv snapshot scan
      ssl:
        enabled: false # Whether to support TiKV secure connection mode
        tlsCert: # path to your cert file
        tlsKey: # path to your key file
        tlsCACert: # path to your CACert file

    localStorage:
      # Local path to where vector data are stored during a search or a query to avoid repetitve access to MinIO or S3 service.
      # Caution: Changing this parameter after using Milvus for a period of time will affect your access to old data.
      # It is recommended to change this parameter before starting Milvus for the first time.
      path: /var/lib/milvus/data/

    # Related configuration of MinIO/S3/GCS or any other service supports S3 API, which is responsible for data persistence for Milvus.
    # We refer to the storage service as MinIO/S3 in the following description for simplicity.
    minio:
      # IP address of MinIO or S3 service.
      # Environment variable: MINIO_ADDRESS
      # minio.address and minio.port together generate the valid access to MinIO or S3 service.
      # MinIO preferentially acquires the valid IP address from the environment variable MINIO_ADDRESS when Milvus is started.
      # Default value applies when MinIO or S3 is running on the same network with Milvus.
      address: 172.17.0.1
      port: 65010 # Port of MinIO or S3 service.
      # Access key ID that MinIO or S3 issues to user for authorized access.
      # Environment variable: MINIO_ACCESS_KEY_ID or minio.accessKeyID
      # minio.accessKeyID and minio.secretAccessKey together are used for identity authentication to access the MinIO or S3 service.
      # This configuration must be set identical to the environment variable MINIO_ACCESS_KEY_ID, which is necessary for starting MinIO or S3.
      # The default value applies to MinIO or S3 service that started with the default docker-compose.yml file.
      accessKeyID: xxxxx
      # Secret key used to encrypt the signature string and verify the signature string on server. It must be kept strictly confidential and accessible only to the MinIO or S3 server and users.
      # Environment variable: MINIO_SECRET_ACCESS_KEY or minio.secretAccessKey
      # minio.accessKeyID and minio.secretAccessKey together are used for identity authentication to access the MinIO or S3 service.
      # This configuration must be set identical to the environment variable MINIO_SECRET_ACCESS_KEY, which is necessary for starting MinIO or S3.
      # The default value applies to MinIO or S3 service that started with the default docker-compose.yml file.
      secretAccessKey: xxxxxx
      useSSL: false # Switch value to control if to access the MinIO or S3 service through SSL.
      ssl:
        tlsCACert: /path/to/public.crt # path to your CACert file
      # Name of the bucket where Milvus stores data in MinIO or S3.
      # Milvus 2.0.0 does not support storing data in multiple buckets.
      # Bucket with this name will be created if it does not exist. If the bucket already exists and is accessible, it will be used directly. Otherwise, there will be an error.
      # To share an MinIO instance among multiple Milvus instances, consider changing this to a different value for each Milvus instance before you start them. For details, see Operation FAQs.
      # The data will be stored in the local Docker if Docker is used to start the MinIO service locally. Ensure that there is sufficient storage space.
      # A bucket name is globally unique in one MinIO or S3 instance.
      bucketName: milvus
      # Root prefix of the key to where Milvus stores data in MinIO or S3.
      # It is recommended to change this parameter before starting Milvus for the first time.
      # To share an MinIO instance among multiple Milvus instances, consider changing this to a different value for each Milvus instance before you start them. For details, see Operation FAQs.
      # Set an easy-to-identify root key prefix for Milvus if etcd service already exists.
      # Changing this for an already running Milvus instance may result in failures to read legacy data.
      rootPath: files
      # Whether to useIAM role to access S3/GCS instead of access/secret keys
      # For more information, refer to
      # aws: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html
      # gcp: https://cloud.google.com/storage/docs/access-control/iam
      # aliyun (ack): https://www.alibabacloud.com/help/en/container-service-for-kubernetes/latest/use-rrsa-to-enforce-access-control
      # aliyun (ecs): https://www.alibabacloud.com/help/en/elastic-compute-service/latest/attach-an-instance-ram-role
      useIAM: false
      # Cloud Provider of S3. Supports: "aws", "gcp", "aliyun".
      # Cloud Provider of Google Cloud Storage. Supports: "gcpnative".
      # You can use "aws" for other cloud provider supports S3 API with signature v4, e.g.: minio
      # You can use "gcp" for other cloud provider supports S3 API with signature v2
      # You can use "aliyun" for other cloud provider uses virtual host style bucket
      # You can use "gcpnative" for the Google Cloud Platform provider. Uses service account credentials
      # for authentication.
      # When useIAM enabled, only "aws", "gcp", "aliyun" is supported for now
      cloudProvider: aws
      # The JSON content contains the gcs service account credentials.
      # Used only for the "gcpnative" cloud provider.
      gcpCredentialJSON:
      # Custom endpoint for fetch IAM role credentials. when useIAM is true & cloudProvider is "aws".
      # Leave it empty if you want to use AWS default endpoint
      iamEndpoint:
      logLevel: fatal # Log level for aws sdk log. Supported level: off, fatal, error, warn, info, debug, trace
      region: # Specify minio storage system location region
      useVirtualHost: false # Whether use virtual host mode for bucket
      requestTimeoutMs: 10000 # minio timeout for request time in milliseconds
      # The maximum number of objects requested per batch in minio ListObjects rpc,
      # 0 means using oss client by default, decrease these configration if ListObjects timeout
      listObjectsMaxKeys: 0

    # Milvus supports four MQ: rocksmq(based on RockDB), natsmq(embedded nats-server), Pulsar and Kafka.
    # You can change your mq by setting mq.type field.
    # If you don't set mq.type field as default, there is a note about enabling priority if we config multiple mq in this file.
    # 1. standalone(local) mode: rocksmq(default) > natsmq > Pulsar > Kafka
    # 2. cluster mode: Pulsar(default) > Kafka (rocksmq and natsmq is unsupported in cluster mode)
    mq:
      # Default value: "default"
      # Valid values: [default, pulsar, kafka, rocksmq, natsmq]
      type: default
      enablePursuitMode: true # Default value: "true"
      pursuitLag: 10 # time tick lag threshold to enter pursuit mode, in seconds
      pursuitBufferSize: 8388608 # pursuit mode buffer size in bytes
      pursuitBufferTime: 60 # pursuit mode buffer time in seconds
      mqBufSize: 16 # MQ client consumer buffer length
      dispatcher:
        mergeCheckInterval: 0.1 # the interval time(in seconds) for dispatcher to check whether to merge
        targetBufSize: 16 # the lenth of channel buffer for targe
        maxTolerantLag: 3 # Default value: "3", the timeout(in seconds) that target sends msgPack

    # Related configuration of pulsar, used to manage Milvus logs of recent mutation operations, output streaming log, and provide log publish-subscribe services.
    pulsar:
      # IP address of Pulsar service.
      # Environment variable: PULSAR_ADDRESS
      # pulsar.address and pulsar.port together generate the valid access to Pulsar.
      # Pulsar preferentially acquires the valid IP address from the environment variable PULSAR_ADDRESS when Milvus is started.
      # Default value applies when Pulsar is running on the same network with Milvus.
      address: localhost
      port: 6650 # Port of Pulsar service.
      webport: 80 # Web port of of Pulsar service. If you connect direcly without proxy, should use 8080.
      # The maximum size of each message in Pulsar. Unit: Byte.
      # By default, Pulsar can transmit at most 2MB of data in a single message. When the size of inserted data is greater than this value, proxy fragments the data into multiple messages to ensure that they can be transmitted correctly.
      # If the corresponding parameter in Pulsar remains unchanged, increasing this configuration will cause Milvus to fail, and reducing it produces no advantage.
      maxMessageSize: 2097152
      # Pulsar can be provisioned for specific tenants with appropriate capacity allocated to the tenant.
      # To share a Pulsar instance among multiple Milvus instances, you can change this to an Pulsar tenant rather than the default one for each Milvus instance before you start them. However, if you do not want Pulsar multi-tenancy, you are advised to change msgChannel.chanNamePrefix.cluster to the different value.
      tenant: public
      namespace: default # A Pulsar namespace is the administrative unit nomenclature within a tenant.
      requestTimeout: 60 # pulsar client global request timeout in seconds
      enableClientMetrics: false # Whether to register pulsar client metrics into milvus metrics path.

    # If you want to enable kafka, needs to comment the pulsar configs
    # kafka:
    #   brokerList: localhost:9092
    #   saslUsername:
    #   saslPassword:
    #   saslMechanisms:
    #   securityProtocol:
    #   ssl:
    #     enabled: false # whether to enable ssl mode
    #     tlsCert: # path to client's public key (PEM) used for authentication
    #     tlsKey: # path to client's private key (PEM) used for authentication
    #     tlsCaCert: # file or directory path to CA certificate(s) for verifying the broker's key
    #     tlsKeyPassword: # private key passphrase for use with ssl.key.location and set_ssl_cert(), if any
    #   readTimeout: 10

    rocksmq:
      # Prefix of the key to where Milvus stores data in RocksMQ.
      # Caution: Changing this parameter after using Milvus for a period of time will affect your access to old data.
      # It is recommended to change this parameter before starting Milvus for the first time.
      # Set an easy-to-identify root key prefix for Milvus if etcd service already exists.
      path: /var/lib/milvus/rdb_data
      lrucacheratio: 0.06 # rocksdb cache memory ratio
      rocksmqPageSize: 67108864 # The maximum size of messages in each page in RocksMQ. Messages in RocksMQ are checked and cleared (when expired) in batch based on this parameters. Unit: Byte.
      retentionTimeInMinutes: 4320 # The maximum retention time of acked messages in RocksMQ. Acked messages in RocksMQ are retained for the specified period of time and then cleared. Unit: Minute.
      retentionSizeInMB: 8192 # The maximum retention size of acked messages of each topic in RocksMQ. Acked messages in each topic are cleared if their size exceed this parameter. Unit: MB.
      compactionInterval: 86400 # Time interval to trigger rocksdb compaction to remove deleted data. Unit: Second
      compressionTypes: 0,0,7,7,7 # compaction compression type, only support use 0,7. 0 means not compress, 7 will use zstd. Length of types means num of rocksdb level.

    # natsmq configuration.
    # more detail: https://docs.nats.io/running-a-nats-service/configuration
    natsmq:
      server:
        port: 4222 # Listening port of the NATS server.
        storeDir: /var/lib/milvus/nats # Directory to use for JetStream storage of nats
        maxFileStore: 17179869184 # Maximum size of the 'file' storage
        maxPayload: 8388608 # Maximum number of bytes in a message payload
        maxPending: 67108864 # Maximum number of bytes buffered for a connection Applies to client connections
        initializeTimeout: 4000 # waiting for initialization of natsmq finished
        monitor:
          trace: false # If true enable protocol trace log messages
          debug: false # If true enable debug log messages
          logTime: true # If set to false, log without timestamps.
          logFile: /tmp/milvus/logs/nats.log # Log file path relative to .. of milvus binary if use relative path
          logSizeLimit: 536870912 # Size in bytes after the log file rolls over to a new one
        retention:
          maxAge: 4320 # Maximum age of any message in the P-channel
          maxBytes: # How many bytes the single P-channel may contain. Removing oldest messages if the P-channel exceeds this size
          maxMsgs: # How many message the single P-channel may contain. Removing oldest messages if the P-channel exceeds this limit

    # Related configuration of rootCoord, used to handle data definition language (DDL) and data control language (DCL) requests
    rootCoord:
      dmlChannelNum: 16 # The number of DML-Channels to create at the root coord startup.
      # The maximum number of partitions in each collection.
      # New partitions cannot be created if this parameter is set as 0 or 1.
      # Range: [0, INT64MAX]
      maxPartitionNum: 1024
      # The minimum row count of a segment required for creating index.
      # Segments with smaller size than this parameter will not be indexed, and will be searched with brute force.
      minSegmentSizeToEnableIndex: 1024
      enableActiveStandby: false
      maxDatabaseNum: 64 # Maximum number of database
      maxGeneralCapacity: 65536 # upper limit for the sum of of product of partitionNumber and shardNumber
      gracefulStopTimeout: 5 # seconds. force stop node without graceful stop
      ip: # TCP/IP address of rootCoord. If not specified, use the first unicastable address
      port: 53100 # TCP port of rootCoord
      grpc:
        serverMaxSendSize: 536870912 # The maximum size of each RPC request that the rootCoord can send, unit: byte
        serverMaxRecvSize: 268435456 # The maximum size of each RPC request that the rootCoord can receive, unit: byte
        clientMaxSendSize: 268435456 # The maximum size of each RPC request that the clients on rootCoord can send, unit: byte
        clientMaxRecvSize: 536870912 # The maximum size of each RPC request that the clients on rootCoord can receive, unit: byte

    # Related configuration of proxy, used to validate client requests and reduce the returned results.
    proxy:
      timeTickInterval: 200 # The interval at which proxy synchronizes the time tick, unit: ms.
      healthCheckTimeout: 3000 # ms, the interval that to do component healthy check
      msgStream:
        timeTick:
          bufSize: 512 # The maximum number of messages can be buffered in the timeTick message stream of the proxy when producing messages.
      maxNameLength: 255 # The maximum length of the name or alias that can be created in Milvus, including the collection name, collection alias, partition name, and field name.
      maxFieldNum: 64 # The maximum number of field can be created when creating in a collection. It is strongly DISCOURAGED to set maxFieldNum >= 64.
      maxVectorFieldNum: 4 # The maximum number of vector fields that can be specified in a collection. Value range: [1, 10].
      maxShardNum: 16 # The maximum number of shards can be created when creating in a collection.
      maxDimension: 32768 # The maximum number of dimensions of a vector can have when creating in a collection.
      # Whether to produce gin logs.\n
      # please adjust in embedded Milvus: false
      ginLogging: true
      ginLogSkipPaths: / # skip url path for gin log
      maxTaskNum: 1024 # The maximum number of tasks in the task queue of the proxy.
      ddlConcurrency: 16 # The concurrent execution number of DDL at proxy.
      dclConcurrency: 16 # The concurrent execution number of DCL at proxy.
      mustUsePartitionKey: false # switch for whether proxy must use partition key for the collection
      # maximum number of result entries, typically Nq * TopK * GroupSize.
      # It costs additional memory and time to process a large number of result entries.
      # If the number of result entries exceeds this limit, the search will be rejected.
      # Disabled if the value is less or equal to 0.
      maxResultEntries: -1
      accessLog:
        enable: false # Whether to enable the access log feature.
        minioEnable: false # Whether to upload local access log files to MinIO. This parameter can be specified when proxy.accessLog.filename is not empty.
        localPath: /tmp/milvus_access # The local folder path where the access log file is stored. This parameter can be specified when proxy.accessLog.filename is not empty.
        filename: # The name of the access log file. If you leave this parameter empty, access logs will be printed to stdout.
        maxSize: 64 # The maximum size allowed for a single access log file. If the log file size reaches this limit, a rotation process will be triggered. This process seals the current access log file, creates a new log file, and clears the contents of the original log file. Unit: MB.
        rotatedTime: 0 # The maximum time interval allowed for rotating a single access log file. Upon reaching the specified time interval, a rotation process is triggered, resulting in the creation of a new access log file and sealing of the previous one. Unit: seconds
        remotePath: access_log/ # The path of the object storage for uploading access log files.
        remoteMaxTime: 0 # The time interval allowed for uploading access log files. If the upload time of a log file exceeds this interval, the file will be deleted. Setting the value to 0 disables this feature.
        formatters:
          base:
            format: "[$time_now] [ACCESS] <$user_name: $user_addr> $method_name [status: $method_status] [code: $error_code] [sdk: $sdk_version] [msg: $error_msg] [traceID: $trace_id] [timeCost: $time_cost]"
          query:
            format: "[$time_now] [ACCESS] <$user_name: $user_addr> $method_name [status: $method_status] [code: $error_code] [sdk: $sdk_version] [msg: $error_msg] [traceID: $trace_id] [timeCost: $time_cost] [database: $database_name] [collection: $collection_name] [partitions: $partition_name] [expr: $method_expr] [params: $query_params]"
            methods: "Query, Delete"
          search:
            format: "[$time_now] [ACCESS] <$user_name: $user_addr> $method_name [status: $method_status] [code: $error_code] [sdk: $sdk_version] [msg: $error_msg] [traceID: $trace_id] [timeCost: $time_cost] [database: $database_name] [collection: $collection_name] [partitions: $partition_name] [expr: $method_expr] [nq: $nq] [params: $search_params]"
            methods: "HybridSearch, Search"
        cacheSize: 0 # Size of log of write cache, in byte. (Close write cache if size was 0)
        cacheFlushInterval: 3 # time interval of auto flush write cache, in seconds. (Close auto flush if interval was 0)
      connectionCheckIntervalSeconds: 120 # the interval time(in seconds) for connection manager to scan inactive client info
      connectionClientInfoTTLSeconds: 86400 # inactive client info TTL duration, in seconds
      maxConnectionNum: 10000 # the max client info numbers that proxy should manage, avoid too many client infos
      gracefulStopTimeout: 30 # seconds. force stop node without graceful stop
      slowQuerySpanInSeconds: 5 # query whose executed time exceeds the `slowQuerySpanInSeconds` can be considered slow, in seconds.
      queryNodePooling:
        size: 10 # the size for shardleader(querynode) client pool
      http:
        enabled: true # Whether to enable the http server
        debug_mode: false # Whether to enable http server debug mode
        port: # high-level restful api
        acceptTypeAllowInt64: true # high-level restful api, whether http client can deal with int64
        enablePprof: true # Whether to enable pprof middleware on the metrics port
        enableWebUI: true # Whether to enable setting the WebUI middleware on the metrics port
      ip: # TCP/IP address of proxy. If not specified, use the first unicastable address
      port: 19530 # TCP port of proxy
      internalPort: 19529
      grpc:
        serverMaxSendSize: 268435456 # The maximum size of each RPC request that the proxy can send, unit: byte
        serverMaxRecvSize: 67108864 # The maximum size of each RPC request that the proxy can receive, unit: byte
        clientMaxSendSize: 268435456 # The maximum size of each RPC request that the clients on proxy can send, unit: byte
        clientMaxRecvSize: 67108864 # The maximum size of each RPC request that the clients on proxy can receive, unit: byte

    # Related configuration of queryCoord, used to manage topology and load balancing for the query nodes, and handoff from growing segments to sealed segments.
    queryCoord:
      taskMergeCap: 1
      taskExecutionCap: 256
      # Switch value to control if to automatically replace a growing segment with the corresponding indexed sealed segment when the growing segment reaches the sealing threshold.
      # If this parameter is set false, Milvus simply searches the growing segments with brute force.
      autoHandoff: true
      autoBalance: true # Switch value to control if to automatically balance the memory usage among query nodes by distributing segment loading and releasing operations evenly.
      autoBalanceChannel: true # Enable auto balance channel
      balancer: ScoreBasedBalancer # auto balancer used for segments on queryNodes
      globalRowCountFactor: 0.1 # the weight used when balancing segments among queryNodes
      scoreUnbalanceTolerationFactor: 0.05 # the least value for unbalanced extent between from and to nodes when doing balance
      reverseUnBalanceTolerationFactor: 1.3 # the largest value for unbalanced extent between from and to nodes after doing balance
      overloadedMemoryThresholdPercentage: 90 # The threshold of memory usage (in percentage) in a query node to trigger the sealed segment balancing.
      balanceIntervalSeconds: 60 # The interval at which query coord balances the memory usage among query nodes.
      memoryUsageMaxDifferencePercentage: 30 # The threshold of memory usage difference (in percentage) between any two query nodes to trigger the sealed segment balancing.
      rowCountFactor: 0.4 # the row count weight used when balancing segments among queryNodes
      segmentCountFactor: 0.4 # the segment count weight used when balancing segments among queryNodes
      globalSegmentCountFactor: 0.1 # the segment count weight used when balancing segments among queryNodes
      # the channel count weight used when balancing channels among queryNodes,
      # A higher value reduces the likelihood of assigning channels from the same collection to the same QueryNode. Set to 1 to disable this feature.
      collectionChannelCountFactor: 10
      segmentCountMaxSteps: 50 # segment count based plan generator max steps
      rowCountMaxSteps: 50 # segment count based plan generator max steps
      randomMaxSteps: 10 # segment count based plan generator max steps
      growingRowCountWeight: 4 # the memory weight of growing segment row count
      delegatorMemoryOverloadFactor: 0.1 # the factor of delegator overloaded memory
      balanceCostThreshold: 0.001 # the threshold of balance cost, if the difference of cluster's cost after executing the balance plan is less than this value, the plan will not be executed
      checkSegmentInterval: 1000
      checkChannelInterval: 1000
      checkBalanceInterval: 300
      autoBalanceInterval: 3000 # the interval for triggerauto balance
      checkIndexInterval: 10000
      channelTaskTimeout: 60000 # 1 minute
      segmentTaskTimeout: 120000 # 2 minute
      distPullInterval: 500
      heartbeatAvailableInterval: 10000 # 10s, Only QueryNodes which fetched heartbeats within the duration are available
      loadTimeoutSeconds: 600
      distRequestTimeout: 5000 # the request timeout for querycoord fetching data distribution from querynodes, in milliseconds
      heatbeatWarningLag: 5000 # the lag value for querycoord report warning when last heatbeat is too old, in milliseconds
      checkHandoffInterval: 5000
      enableActiveStandby: false
      checkInterval: 1000
      checkHealthInterval: 3000 # 3s, the interval when query coord try to check health of query node
      checkHealthRPCTimeout: 2000 # 100ms, the timeout of check health rpc to query node
      brokerTimeout: 5000 # 5000ms, querycoord broker rpc timeout
      collectionRecoverTimes: 3 # if collection recover times reach the limit during loading state, release it
      observerTaskParallel: 16 # the parallel observer dispatcher task number
      checkAutoBalanceConfigInterval: 10 # the interval of check auto balance config
      checkNodeSessionInterval: 60 # the interval(in seconds) of check querynode cluster session
      gracefulStopTimeout: 5 # seconds. force stop node without graceful stop
      enableStoppingBalance: true # whether enable stopping balance
      channelExclusiveNodeFactor: 4 # the least node number for enable channel's exclusive mode
      collectionObserverInterval: 200 # the interval of collection observer
      checkExecutedFlagInterval: 100 # the interval of check executed flag to force to pull dist
      updateCollectionLoadStatusInterval: 5 # 5m, max interval of updating collection loaded status for check health
      cleanExcludeSegmentInterval: 60 # the time duration of clean pipeline exclude segment which used for filter invalid data, in seconds
      ip: # TCP/IP address of queryCoord. If not specified, use the first unicastable address
      port: 19531 # TCP port of queryCoord
      grpc:
        serverMaxSendSize: 536870912 # The maximum size of each RPC request that the queryCoord can send, unit: byte
        serverMaxRecvSize: 268435456 # The maximum size of each RPC request that the queryCoord can receive, unit: byte
        clientMaxSendSize: 268435456 # The maximum size of each RPC request that the clients on queryCoord can send, unit: byte
        clientMaxRecvSize: 536870912 # The maximum size of each RPC request that the clients on queryCoord can receive, unit: byte

    # Related configuration of queryNode, used to run hybrid search between vector and scalar data.
    queryNode:
      stats:
        publishInterval: 1000 # The interval that query node publishes the node statistics information, including segment status, cpu usage, memory usage, health status, etc. Unit: ms.
      segcore:
        knowhereThreadPoolNumRatio: 4 # The number of threads in knowhere's thread pool. If disk is enabled, the pool size will multiply with knowhereThreadPoolNumRatio([1, 32]).
        chunkRows: 128 # Row count by which Segcore divides a segment into chunks.
        interimIndex:
          # Whether to create a temporary index for growing segments and sealed segments not yet indexed, improving search performance.
          # Milvus will eventually seals and indexes all segments, but enabling this optimizes search performance for immediate queries following data insertion.
          # This defaults to true, indicating that Milvus creates temporary index for growing segments and the sealed segments that are not indexed upon searches.
          enableIndex: true
          nlist: 128 # interim index nlist, recommend to set sqrt(chunkRows), must smaller than chunkRows/8
          nprobe: 16 # nprobe to search small index, based on your accuracy requirement, must smaller than nlist
          subDim: 4 # interim index sub dim, recommend to (subDim % vector dim == 0)
          refineRatio: 4.5 # interim index parameters, should set to be >= 1.0
          indexBuildRatio: 0.1 # the ratio of building interim index rows count with max row count of a flush segment, should set to be < 1.0
          refineQuantType: NONE # Data representation of SCANN_DVR index, options: 'NONE', 'FLOAT16', 'BFLOAT16' and 'UINT8'
          refineWithQuant: true # whether to use refineQuantType to refine for faster but loss a little precision
          denseVectorIndexType: IVF_FLAT_CC # Dense vector intermin index type
          memExpansionRate: 1.15 # extra memory needed by building interim index
          buildParallelRate: 0.5 # the ratio of building interim index parallel matched with cpu num
        multipleChunkedEnable: true # Enable multiple chunked search
        deleteDumpBatchSize: 10000 # Batch size for delete snapshot dump in segcore.
        knowhereScoreConsistency: false # Enable knowhere strong consistency score computation logic
        jsonKeyStatsCommitInterval: 200 # the commit interval for the JSON key Stats to commit
      loadMemoryUsageFactor: 1 # The multiply factor of calculating the memory usage while loading segments
      enableDisk: false # enable querynode load disk index, and search on disk index
      maxDiskUsagePercentage: 95
      cache:
        memoryLimit: 2147483648 # 2 GB, 2 * 1024 *1024 *1024
        readAheadPolicy: willneed # The read ahead policy of chunk cache, options: `normal, random, sequential, willneed, dontneed`
        # options: async, sync, disable.
        # Specifies the necessity for warming up the chunk cache.
        # 1. If set to "sync" or "async" the original vector data will be synchronously/asynchronously loaded into the
        # chunk cache during the load process. This approach has the potential to substantially reduce query/search latency
        # for a specific duration post-load, albeit accompanied by a concurrent increase in disk usage;
        # 2. If set to "disable" original vector data will only be loaded into the chunk cache during search/query.
        warmup: disable
      mmap:
        vectorField: false # Enable mmap for loading vector data
        vectorIndex: false # Enable mmap for loading vector index
        scalarField: false # Enable mmap for loading scalar data
        scalarIndex: false # Enable mmap for loading scalar index
        chunkCache: true # Enable mmap for chunk cache (raw vector retrieving).
        # Enable memory mapping (mmap) to optimize the handling of growing raw data.
        # By activating this feature, the memory overhead associated with newly added or modified data will be significantly minimized.
        # However, this optimization may come at the cost of a slight decrease in query latency for the affected data segments.
        growingMmapEnabled: false
        fixedFileSizeForMmapAlloc: 1 # tmp file size for mmap chunk manager
        maxDiskUsagePercentageForMmapAlloc: 50 # disk percentage used in mmap chunk manager
      lazyload:
        enabled: false # Enable lazyload for loading data
        waitTimeout: 30000 # max wait timeout duration in milliseconds before start to do lazyload search and retrieve
        requestResourceTimeout: 5000 # max timeout in milliseconds for waiting request resource for lazy load, 5s by default
        requestResourceRetryInterval: 2000 # retry interval in milliseconds for waiting request resource for lazy load, 2s by default
        maxRetryTimes: 1 # max retry times for lazy load, 1 by default
        maxEvictPerRetry: 1 # max evict count for lazy load, 1 by default
      indexOffsetCacheEnabled: false # enable index offset cache for some scalar indexes, now is just for bitmap index, enable this param can improve performance for retrieving raw data from index
      grouping:
        enabled: true
        maxNQ: 1000
        topKMergeRatio: 20
      scheduler:
        receiveChanSize: 10240
        unsolvedQueueSize: 10240
        # maxReadConcurrentRatio is the concurrency ratio of read task (search task and query task).
        # Max read concurrency would be the value of hardware.GetCPUNum * maxReadConcurrentRatio.
        # It defaults to 2.0, which means max read concurrency would be the value of hardware.GetCPUNum * 2.
        # Max read concurrency must greater than or equal to 1, and less than or equal to hardware.GetCPUNum * 100.
        # (0, 100]
        maxReadConcurrentRatio: 1
        cpuRatio: 10 # ratio used to estimate read task cpu usage.
        maxTimestampLag: 86400
        scheduleReadPolicy:
          # fifo: A FIFO queue support the schedule.
          # user-task-polling:
          #     The user's tasks will be polled one by one and scheduled.
          #     Scheduling is fair on task granularity.
          #     The policy is based on the username for authentication.
          #     And an empty username is considered the same user.
          #     When there are no multi-users, the policy decay into FIFO"
          name: fifo
          taskQueueExpire: 60 # Control how long (many seconds) that queue retains since queue is empty
          enableCrossUserGrouping: false # Enable Cross user grouping when using user-task-polling policy. (Disable it if user's task can not merge each other)
          maxPendingTaskPerUser: 1024 # Max pending task per user in scheduler
      levelZeroForwardPolicy: FilterByBF # delegator level zero deletion forward policy, possible option["FilterByBF", "RemoteLoad"]
      streamingDeltaForwardPolicy: FilterByBF # delegator streaming deletion forward policy, possible option["FilterByBF", "Direct"]
      forwardBatchSize: 4194304 # the batch size delegator uses for forwarding stream delete in loading procedure
      exprCache:
        enabled: false # enable expression result cache
        capacityBytes: 268435456 # max capacity in bytes for expression result cache
      dataSync:
        flowGraph:
          maxQueueLength: 16 # The maximum size of task queue cache in flow graph in query node.
          maxParallelism: 1024 # Maximum number of tasks executed in parallel in the flowgraph
      enableSegmentPrune: false # use partition stats to prune data in search/query on shard delegator
      queryStreamBatchSize: 4194304 # return min batch size of stream query
      queryStreamMaxBatchSize: 134217728 # return max batch size of stream query
      bloomFilterApplyParallelFactor: 4 # parallel factor when to apply pk to bloom filter, default to 4*CPU_CORE_NUM
      workerPooling:
        size: 10 # the size for worker querynode client pool
      idfOracle:
        enableDisk: true
        writeConcurrency: 4
      ip: # TCP/IP address of queryNode. If not specified, use the first unicastable address
      port: 21123 # TCP port of queryNode
      grpc:
        serverMaxSendSize: 536870912 # The maximum size of each RPC request that the queryNode can send, unit: byte
        serverMaxRecvSize: 268435456 # The maximum size of each RPC request that the queryNode can receive, unit: byte
        clientMaxSendSize: 268435456 # The maximum size of each RPC request that the clients on queryNode can send, unit: byte
        clientMaxRecvSize: 536870912 # The maximum size of each RPC request that the clients on queryNode can receive, unit: byte

    indexCoord:
      bindIndexNodeMode:
        enable: false
        address: localhost:22930
        withCred: false
        nodeID: 0
      segment:
        minSegmentNumRowsToEnableIndex: 1024 # It's a threshold. 
    # When the segment num rows is less than this value, the segment will not be indexed

indexNode:
  scheduler:
    buildParallel: 1
  ip: # TCP/IP address of indexNode. If not specified, use the first unicastable address
  port: 21121 # TCP port of indexNode
  grpc:
    serverMaxSendSize: 536870912 # The maximum size of each RPC request that the indexNode can send, unit: byte
    serverMaxRecvSize: 268435456 # The maximum size of each RPC request that the indexNode can receive, unit: byte
    clientMaxSendSize: 268435456 # The maximum size of each RPC request that the clients on indexNode can send, unit: byte
    clientMaxRecvSize: 536870912 # The maximum size of each RPC request that the clients on indexNode can receive, unit: byte

dataCoord:
  channel:
    watchTimeoutInterval: 300 # Timeout on watching channels (in seconds). Datanode tickler update watch progress will reset timeout timer.
    legacyVersionWithoutRPCWatch: 2.4.1 # Datanodes <= this version are considered as legacy nodes, which doesn't have rpc based watch(). This is only used during rolling upgrade where legacy nodes won't get new channels
    balanceSilentDuration: 300 # The duration after which the channel manager start background channel balancing
    balanceInterval: 360 # The interval with which the channel manager check dml channel balance status
    checkInterval: 1 # The interval in seconds with which the channel manager advances channel states
    notifyChannelOperationTimeout: 5 # Timeout notifing channel operations (in seconds).
  segment:
    maxSize: 1024 # The maximum size of a segment, unit: MB. datacoord.segment.maxSize and datacoord.segment.sealProportion together determine if a segment can be sealed.
    diskSegmentMaxSize: 2048 # Maximum size of a segment in MB for collection which has Disk index
    sealProportion: 0.12 # The minimum proportion to datacoord.segment.maxSize to seal a segment. datacoord.segment.maxSize and datacoord.segment.sealProportion together determine if a segment can be sealed.
    sealProportionJitter: 0.1 # segment seal proportion jitter ratio, default value 0.1(10%), if seal proportion is 12%, with jitter=0.1, the actuall applied ratio will be 10.8~12%
    assignmentExpiration: 2000 # Expiration time of the segment assignment, unit: ms
    allocLatestExpireAttempt: 200 # The time attempting to alloc latest lastExpire from rootCoord after restart
    maxLife: 86400 # The max lifetime of segment in seconds, 24*60*60
    # If a segment didn't accept dml records in maxIdleTime and the size of segment is greater than
    # minSizeFromIdleToSealed, Milvus will automatically seal it.
    # The max idle time of segment in seconds, 10*60.
    maxIdleTime: 600
    minSizeFromIdleToSealed: 16 # The min size in MB of segment which can be idle from sealed.
    # The max number of binlog (which is equal to the binlog file num of primary key) for one segment,
    # the segment will be sealed if the number of binlog file reaches to max value.
    maxBinlogFileNumber: 32
    smallProportion: 0.5 # The segment is considered as "small segment" when its # of rows is smaller than
    # (smallProportion * segment max # of rows).
    # A compaction will happen on small segments if the segment after compaction will have
    compactableProportion: 0.85
    # over (compactableProportion * segment max # of rows) rows.
    # MUST BE GREATER THAN OR EQUAL TO <smallProportion>!!!
    # During compaction, the size of segment # of rows is able to exceed segment max # of rows by (expansionRate-1) * 100%.
    expansionRate: 1.25
  sealPolicy:
    channel:
      # The size threshold in MB, if the total size of growing segments of each shard
      # exceeds this threshold, the largest growing segment will be sealed.
      growingSegmentsMemSize: 4096
      # If the total entry number of l0 logs of each shard
      # exceeds this threshold, the earliest growing segments will be sealed.
      blockingL0EntryNum: 5000000
      # The size threshold in MB, if the total entry number of l0 logs of each shard
      # exceeds this threshold, the earliest growing segments will be sealed.
      blockingL0SizeInMB: 64
  autoUpgradeSegmentIndex: false # whether auto upgrade segment index to index engine's version
  forceRebuildSegmentIndex: false # force rebuild segment index to specify index engine's version
  # if param forceRebuildSegmentIndex is enabled, the vector index will be rebuilt to aligned with targetVecIndexVersion.
  # if param forceRebuildSegmentIndex is not enabled, the newly created vector index will be aligned with the newer one of index engine's version and targetVecIndexVersion.
  # if param targetVecIndexVersion is not set, the default value is -1, which means no target vec index version, then the vector index will be aligned with index engine's version
  targetVecIndexVersion: -1
  segmentFlushInterval: 2 # the minimal interval duration(unit: Seconds) between flushing operation on same segment
  # Switch value to control if to enable segment compaction.
  # Compaction merges small-size segments into a large segment, and clears the entities deleted beyond the rentention duration of Time Travel.
  enableCompaction: true
  compaction:
    # Switch value to control if to enable automatic segment compaction during which data coord locates and merges compactable segments in the background.
    # This configuration takes effect only when dataCoord.enableCompaction is set as true.
    enableAutoCompaction: true
    indexBasedCompaction: true
    # compaction task prioritizer, options: [default, level, mix].
    # default is FIFO.
    # level is prioritized by level: L0 compactions first, then mix compactions, then clustering compactions.
    # mix is prioritized by level: mix compactions first, then L0 compactions, then clustering compactions.
    taskPrioritizer: default
    taskQueueCapacity: 100000 # compaction task queue size
    rpcTimeout: 10
    maxParallelTaskNum: 10
    dropTolerance: 86400 # Compaction task will be cleaned after finish longer than this time(in seconds)
    gcInterval: 1800 # The time interval in seconds for compaction gc
    scheduleInterval: 500 # The time interval in milliseconds for scheduling compaction tasks. If the configuration setting is below 100ms, it will be adjusted upwards to 100ms
    mix:
      triggerInterval: 60 # The time interval in seconds to trigger mix compaction
    levelzero:
      triggerInterval: 10 # The time interval in seconds for trigger L0 compaction
      forceTrigger:
        minSize: 8388608 # The minimum size in bytes to force trigger a LevelZero Compaction, default as 8MB
        maxSize: 67108864 # The maxmum size in bytes to force trigger a LevelZero Compaction, default as 64MB
        deltalogMinNum: 10 # The minimum number of deltalog files to force trigger a LevelZero Compaction
        deltalogMaxNum: 30 # The maxmum number of deltalog files to force trigger a LevelZero Compaction, default as 30
    expiry:
      tolerance: -1 # tolerant duration in hours for expiry data, negative value means disable force expiry compaction
    single:
      ratio:
        threshold: 0.2 # The ratio threshold of a segment to trigger a single compaction, default as 0.2
      deltalog:
        maxsize: 16777216 # The deltalog size of a segment to trigger a single compaction, default as 16MB
        maxnum: 200 # The deltalog count of a segment to trigger a compaction, default as 200
      expiredlog:
        maxsize: 10485760 # The expired log size of a segment to trigger a compaction, default as 10MB
    clustering:
      enable: true # Enable clustering compaction
      autoEnable: false # Enable auto clustering compaction
      triggerInterval: 600 # clustering compaction trigger interval in seconds
      minInterval: 3600 # The minimum interval between clustering compaction executions of one collection, to avoid redundant compaction
      maxInterval: 259200 # If a collection haven't been clustering compacted for longer than maxInterval, force compact
      newDataSizeThreshold: 512m # If new data size is large than newDataSizeThreshold, execute clustering compaction
      preferSegmentSizeRatio: 0.8
      maxSegmentSizeRatio: 1
      maxTrainSizeRatio: 0.8 # max data size ratio in Kmeans train, if larger than it, will down sampling to meet this limit
      maxCentroidsNum: 10240 # maximum centroids number in Kmeans train
      minCentroidsNum: 16 # minimum centroids number in Kmeans train
      minClusterSizeRatio: 0.01 # minimum cluster size / avg size in Kmeans train
      maxClusterSizeRatio: 10 # maximum cluster size / avg size in Kmeans train
      maxClusterSize: 5g # maximum cluster size in Kmeans train
  syncSegmentsInterval: 300 # The time interval for regularly syncing segments
  index:
    memSizeEstimateMultiplier: 2 # When the memory size is not setup by index procedure, multiplier to estimate the memory size of index data
  enableGarbageCollection: true # Switch value to control if to enable garbage collection to clear the discarded data in MinIO or S3 service.
  gc:
    interval: 3600 # The interval at which data coord performs garbage collection, unit: second.
    missingTolerance: 86400 # The retention duration of the unrecorded binary log (binlog) files. Setting a reasonably large value for this parameter avoids erroneously deleting the newly created binlog files that lack metadata. Unit: second.
    dropTolerance: 10800 # The retention duration of the binlog files of the deleted segments before they are cleared, unit: second.
    removeConcurrent: 32 # number of concurrent goroutines to remove dropped s3 objects
    scanInterval: 168 # orphan file (file on oss but has not been registered on meta) on object storage garbage collection scanning interval in hours
  enableActiveStandby: false
  brokerTimeout: 5000 # 5000ms, dataCoord broker rpc timeout
  autoBalance: true # Enable auto balance
  checkAutoBalanceConfigInterval: 10 # the interval of check auto balance config
  import:
    filesPerPreImportTask: 2 # The maximum number of files allowed per pre-import task.
    taskRetention: 10800 # The retention period in seconds for tasks in the Completed or Failed state.
    maxSizeInMBPerImportTask: 6144 # To prevent generating of small segments, we will re-group imported files. This parameter represents the sum of file sizes in each group (each ImportTask).
    scheduleInterval: 2 # The interval for scheduling import, measured in seconds.
    checkIntervalHigh: 2 # The interval for checking import, measured in seconds, is set to a high frequency for the import checker.
    checkIntervalLow: 120 # The interval for checking import, measured in seconds, is set to a low frequency for the import checker.
    maxImportFileNumPerReq: 1024 # The maximum number of files allowed per single import request.
    maxImportJobNum: 1024 # Maximum number of import jobs that are executing or pending.
    waitForIndex: true # Indicates whether the import operation waits for the completion of index building.
  gracefulStopTimeout: 5 # seconds. force stop node without graceful stop
  slot:
    clusteringCompactionUsage: 16 # slot usage of clustering compaction job.
    mixCompactionUsage: 8 # slot usage of mix compaction job.
    l0DeleteCompactionUsage: 8 # slot usage of l0 compaction job.
    indexTaskSlotUsage: 64 # slot usage of index task per 512mb
    statsTaskSlotUsage: 8 # slot usage of stats task per 512mb
    analyzeTaskSlotUsage: 65535 # slot usage of analyze task
  jsonStatsTriggerCount: 10 # jsonkey stats task count per trigger
  jsonStatsTriggerInterval: 10 # jsonkey task interval per trigger
  enabledJSONKeyStatsInSort: false # Indicates whether to enable JSON key stats task with sort
  jsonKeyStatsMemoryBudgetInTantivy: 16777216 # the memory budget for the JSON index In Tantivy, the unit is bytes
  ip: # TCP/IP address of dataCoord.
  # If not specified, use the first unicastable address
  port: 13333 # TCP port of dataCoord
  grpc:
    serverMaxSendSize: 536870912 # The maximum size of each RPC request that the dataCoord can send, unit: byte
    serverMaxRecvSize: 268435456 # The maximum size of each RPC request that the dataCoord can receive, unit: byte
    clientMaxSendSize: 268435456 # The maximum size of each RPC request that the clients on dataCoord can send, unit: byte
    clientMaxRecvSize: 536870912 # The maximum size of each RPC request that the clients on dataCoord can receive, unit: byte

dataNode:
  dataSync:
    flowGraph:
      maxQueueLength: 16 # Maximum length of task queue in flowgraph
      maxParallelism: 1024 # Maximum number of tasks executed in parallel in the flowgraph
    maxParallelSyncMgrTasks: 256 # The max concurrent sync task number of datanode sync mgr globally
    skipMode:
      enable: true # Support skip some timetick message to reduce CPU usage
      skipNum: 4 # Consume one for every n records skipped
      coldTime: 60 # Turn on skip mode after there are only timetick msg for x seconds
  segment:
    # The maximum size of each binlog file in a segment buffered in memory. Binlog files whose size exceeds this value are then flushed to MinIO or S3 service.
    # Unit: Byte
    # Setting this parameter too small causes the system to store a small amount of data too frequently. Setting it too large increases the system's demand for memory.
    insertBufSize: 16777216
    deleteBufBytes: 16777216 # Max buffer size in bytes to flush del for a single channel, default as 16MB
    syncPeriod: 600 # The period to sync segments if buffer is not empty.
  memory:
    forceSyncEnable: true # Set true to force sync if memory usage is too high
    forceSyncSegmentNum: 1 # number of segments to sync, segments with top largest buffer will be synced.
    checkInterval: 3000 # the interal to check datanode memory usage, in milliseconds
    forceSyncWatermark: 0.5 # memory watermark for standalone, upon reaching this watermark, segments will be synced.
  timetick:
    interval: 500
  channel:
    # specify the size of global work pool of all channels
    # if this parameter <= 0, will set it as the maximum number of CPUs that can be executing
    # suggest to set it bigger on large collection numbers to avoid blocking
    workPoolSize: -1
    # specify the size of global work pool for channel checkpoint updating
    # if this parameter <= 0, will set it as 10
    updateChannelCheckpointMaxParallel: 10
    updateChannelCheckpointInterval: 60 # the interval duration(in seconds) for datanode to update channel checkpoint of each channel
    updateChannelCheckpointRPCTimeout: 20 # timeout in seconds for UpdateChannelCheckpoint RPC call
    maxChannelCheckpointsPerPRC: 128 # The maximum number of channel checkpoints per UpdateChannelCheckpoint RPC.
    channelCheckpointUpdateTickInSeconds: 10 # The frequency, in seconds, at which the channel checkpoint updater executes updates.
  import:
    concurrencyPerCPUCore: 4 # The execution concurrency unit for import/pre-import tasks per CPU core.
    maxImportFileSizeInGB: 16 # The maximum file size (in GB) for an import file, where an import file refers to either a Row-Based file or a set of Column-Based files.
    readBufferSizeInMB: 64 # The insert buffer size (in MB) during import.
    readDeleteBufferSizeInMB: 16 # The delete buffer size (in MB) during import.
  compaction:
    levelZeroBatchMemoryRatio: 0.5 # The minimal memory ratio of free memory for level zero compaction executing in batch mode
    levelZeroMaxBatchSize: -1 # Max batch size refers to the max number of L1/L2 segments in a batch when executing L0 compaction. Default to -1, any value that is less than 1 means no limit. Valid range: >= 1.
    useMergeSort: false # Whether to enable mergeSort mode when performing mixCompaction.
    maxSegmentMergeSort: 30 # The maximum number of segments to be merged in mergeSort mode.
  gracefulStopTimeout: 1800 # seconds. force stop node without graceful stop
  slot:
    slotCap: 16 # The maximum number of tasks(e.g. compaction, importing) allowed to run concurrently on a datanode
  clusteringCompaction:
    memoryBufferRatio: 0.3 # The ratio of memory buffer of clustering compaction. Data larger than threshold will be flushed to storage.
    workPoolSize: 8 # worker pool size for one clustering compaction job.
  bloomFilterApplyParallelFactor: 4 # parallel factor when to apply pk to bloom filter, default to 4*CPU_CORE_NUM
  storage:
    deltalog: json # deltalog format, options: [json, parquet]
  ip: # TCP/IP address of dataNode. If not specified, use the first unicastable address
  port: 21124 # TCP port of dataNode
  grpc:
    serverMaxSendSize: 536870912 # The maximum size of each RPC request that the dataNode can send, unit: byte
    serverMaxRecvSize: 268435456 # The maximum size of each RPC request that the dataNode can receive, unit: byte
    clientMaxSendSize: 268435456 # The maximum size of each RPC request that the clients on dataNode can send, unit: byte
    clientMaxRecvSize: 536870912 # The maximum size of each RPC request that the clients on dataNode can receive, unit: byte

# This topic introduces the message channel-related configurations of Milvus.
msgChannel:
  chanNamePrefix:
    # Root name prefix of the channel when a message channel is created.
    # It is recommended to change this parameter before starting Milvus for the first time.
    # To share a Pulsar instance among multiple Milvus instances, consider changing this to a name rather than the default one for each Milvus instance before you start them.
    cluster: by-dev
    # Sub-name prefix of the message channel where the root coord publishes time tick messages.
    # The complete channel name prefix is ${msgChannel.chanNamePrefix.cluster}-${msgChannel.chanNamePrefix.rootCoordTimeTick}
    # Caution: Changing this parameter after using Milvus for a period of time will affect your access to old data.
    # It is recommended to change this parameter before starting Milvus for the first time.
    rootCoordTimeTick: rootcoord-timetick
    # Sub-name prefix of the message channel where the root coord publishes its own statistics messages.
    # The complete channel name prefix is ${msgChannel.chanNamePrefix.cluster}-${msgChannel.chanNamePrefix.rootCoordStatistics}
    # Caution: Changing this parameter after using Milvus for a period of time will affect your access to old data.
    # It is recommended to change this parameter before starting Milvus for the first time.
    rootCoordStatistics: rootcoord-statistics
    # Sub-name prefix of the message channel where the root coord publishes Data Manipulation Language (DML) messages.
    # The complete channel name prefix is ${msgChannel.chanNamePrefix.cluster}-${msgChannel.chanNamePrefix.rootCoordDml}
    # Caution: Changing this parameter after using Milvus for a period of time will affect your access to old data.
    # It is recommended to change this parameter before starting Milvus for the first time.
    rootCoordDml: rootcoord-dml
    replicateMsg: replicate-msg
    # Sub-name prefix of the message channel where the query node publishes time tick messages.
    # The complete channel name prefix is ${msgChannel.chanNamePrefix.cluster}-${msgChannel.chanNamePrefix.queryTimeTick}
    # Caution: Changing this parameter after using Milvus for a period of time will affect your access to old data.
    # It is recommended to change this parameter before starting Milvus for the first time.
    queryTimeTick: queryTimeTick
    # Sub-name prefix of the message channel where the data coord publishes time tick messages.
    # The complete channel name prefix is ${msgChannel.chanNamePrefix.cluster}-${msgChannel.chanNamePrefix.dataCoordTimeTick}
    # Caution: Changing this parameter after using Milvus for a period of time will affect your access to old data.
    # It is recommended to change this parameter before starting Milvus for the first time.
    dataCoordTimeTick: datacoord-timetick-channel
    # Sub-name prefix of the message channel where the data coord publishes segment information messages.
    # The complete channel name prefix is ${msgChannel.chanNamePrefix.cluster}-${msgChannel.chanNamePrefix.dataCoordSegmentInfo}
    # Caution: Changing this parameter after using Milvus for a period of time will affect your access to old data.
    # It is recommended to change this parameter before starting Milvus for the first time.
    dataCoordSegmentInfo: segment-info-channel
  subNamePrefix:
    # Subscription name prefix of the data coord.
    # Caution: Changing this parameter after using Milvus for a period of time will affect your access to old data.
    # It is recommended to change this parameter before starting Milvus for the first
    # time.
    dataCoordSubNamePrefix: dataCoord
    # Subscription name prefix of the data node.
    # Caution: Changing this parameter after using Milvus for a period of time will affect your access to old data.
    # It is recommended to change this parameter before starting Milvus for the first time.
    dataNodeSubNamePrefix: dataNode

# Configures the system log output.
log:
  # Milvus log level. Option: debug, info, warn, error, panic, and fatal.
  # It is recommended to use debug level under test and development environments, and info level in production environment.
  level: info
  file:
    # Root path to the log files.
    # The default value is set empty, indicating to output log files to standard output (stdout) and standard error (stderr).
    # If this parameter is set to a valid local path, Milvus writes and stores log files in this path.
    # Set this parameter as the path that you have permission to write.
    rootPath:
    maxSize: 300 # The maximum size of a log file, unit: MB.
    maxAge: 10 # The maximum retention time before a log file is automatically cleared, unit: day. The minimum value is 1.
    maxBackups: 20 # The maximum number of log files to back up, unit: day. The minimum value is 1.
  format: text # Milvus log format. Option: text and JSON
  stdout: true # Stdout enable or not

grpc:
  log:
    level: WARNING
  gracefulStopTimeout: 3 # second, time to wait graceful stop finish
  client:
    compressionEnabled: false
    dialTimeout: 200
    keepAliveTime: 10000
    keepAliveTimeout: 20000
    maxMaxAttempts: 10
    initialBackoff: 0.2
    maxBackoff: 10
    backoffMultiplier: 2
    minResetInterval: 1000
    maxCancelError: 32
    minSessionCheckInterval: 200

# Configure external tls.
tls:
  serverPemPath: configs/cert/server.pem
  serverKeyPath: configs/cert/server.key
  caPemPath: configs/cert/ca.pem

# Configure internal tls.
internaltls:
  serverPemPath: configs/cert/server.pem
  serverKeyPath: configs/cert/server.key
  caPemPath: configs/cert/ca.pem
  sni: localhost # The server name indication (SNI) for internal TLS, should be the same as the name provided by the certificates ref: https://en.wikipedia.org/wiki/Server_Name_Indication

common:
  defaultPartitionName: _default # Name of the default partition when a collection is created
  defaultIndexName: _default_idx # Name of the index when it is created with name unspecified
  entityExpiration: -1 # Entity expiration in seconds, CAUTION -1 means never expire
  indexSliceSize: 16 # Index slice size in MB
  threadCoreCoefficient:
    highPriority: 10 # This parameter specify how many times the number of threads is the number of cores in high priority pool
    middlePriority: 5 # This parameter specify how many times the number of threads is the number of cores in middle priority pool
    lowPriority: 1 # This parameter specify how many times the number of threads is the number of cores in low priority pool
    chunkCache: 10 # This parameter specify how many times the number of threads is the number of cores in chunk cache pool
  buildIndexThreadPoolRatio: 0.75
  DiskIndex:
    MaxDegree: 56
    SearchListSize: 100
    PQCodeBudgetGBRatio: 0.125
    BuildNumThreadsRatio: 1
    SearchCacheBudgetGBRatio: 0.1
    LoadNumThreadRatio: 8
    BeamWidthRatio: 4
  gracefulTime: 5000 # milliseconds. it represents the interval (in ms) by which the request arrival time needs to be subtracted in the case of Bounded Consistency.
  gracefulStopTimeout: 1800 # seconds. it will force quit the server if the graceful stop process is not completed during this time.
  storageType: remote # please adjust in embedded Milvus: local, available values are [local, remote, opendal], value minio is deprecated, use remote instead
  # Default value: auto
  # Valid values: [auto, avx512, avx2, avx, sse4_2]
  # This configuration is only used by querynode and indexnode, it selects CPU instruction set for Searching and Index-building.
  simdType: auto
  # This parameter controls the write mode of the local disk, which is used to write temporary data downloaded from remote storage.
  # Currently, only QueryNode uses 'common.diskWrite*' parameters. Support for other components will be added in the future.
  # The options include 'direct' and 'buffered'. The default value is 'buffered'.
  diskWriteMode: buffered
  # Disk write buffer size in KB, only used when disk write mode is 'direct', default is 64KB.
  # Current valid range is [4, 65536]. If the value is not aligned to 4KB, it will be rounded up to the nearest multiple of 4KB.
  diskWriteBufferSizeKb: 64
  # This parameter controls the number of writer threads used for disk write operations.
  # The valid range is [0, hardware_concurrency].
  # It is designed to limit the maximum concurrency of disk write operations to reduce the impact on disk read performance.
  # For example, if you want to limit the maximum concurrency of disk write operations to 1, you can set this parameter to 1.
  # The default value is 0, which means the caller will perform write operations directly without using an additional writer thread pool.
  # In this case, the maximum concurrency of disk write operations is determined by the caller's thread pool size.
  diskWriteNumThreads: 0
  diskWriteRateLimiter:
    refillPeriodUs: 100000 # refill period in microseconds if disk rate limiter is enabled, default is 100000us (100ms)
    avgKBps: 262144 # average kilobytes per second if disk rate limiter is enabled, default is 262144KB/s (256MB/s)
    maxBurstKBps: 524288 # max burst kilobytes per second if disk rate limiter is enabled, default is 524288KB/s (512MB/s)
    # amplification ratio for high priority tasks if disk rate limiter is enabled, value <= 0 means ratio limit is disabled.
    # The ratio is the multiplication factor of the configured bandwidth.
    # For example, if the rate limit is 100KB/s, and the high priority ratio is 2, then the high priority tasks will be limited to 200KB/s.
    highPriorityRatio: -1
    middlePriorityRatio: -1 # amplification ratio for middle priority tasks if disk rate limiter is enabled, value <= 0 means ratio limit is disabled
    lowPriorityRatio: -1 # amplification ratio for low priority tasks if disk rate limiter is enabled, value <= 0 means ratio limit is disabled
  security:
    authorizationEnabled: true
    # The superusers will ignore some system check processes,
    # like the old password verification when updating the credential
    superUsers: root
    # default password for root user. The maximum length is 72 characters.
    # Large numeric passwords require double quotes to avoid yaml parsing precision issues.
    defaultRootPassword: "xxxxxx"
    rootShouldBindRole: false # Whether the root user should bind a role when the authorization is enabled.
    enablePublicPrivilege: true # Whether to enable public privilege
    rbac:
      overrideBuiltInPrivilegeGroups:
        enabled: false # Whether to override build-in privilege groups
      cluster:
        readonly:
          privileges: ListDatabases,SelectOwnership,SelectUser,DescribeResourceGroup,ListResourceGroups,ListPrivilegeGroups # Cluster level readonly privileges
        readwrite:
          privileges: ListDatabases,SelectOwnership,SelectUser,DescribeResourceGroup,ListResourceGroups,ListPrivilegeGroups,FlushAll,TransferNode,TransferReplica,UpdateResourceGroups # Cluster level readwrite privileges
        admin:
          privileges: ListDatabases,SelectOwnership,SelectUser,DescribeResourceGroup,ListResourceGroups,ListPrivilegeGroups,FlushAll,TransferNode,TransferReplica,UpdateResourceGroups,BackupRBAC,RestoreRBAC,CreateDatabase,DropDatabase,CreateOwnership,DropOwnership,ManageOwnership,CreateResourceGroup,DropResourceGroup,UpdateUser,RenameCollection,CreatePrivilegeGroup,DropPrivilegeGroup,OperatePrivilegeGroup # Cluster level admin privileges
      database:
        readonly:
          privileges: ShowCollections,DescribeDatabase # Database level readonly privileges
        readwrite:
          privileges: ShowCollections,DescribeDatabase,AlterDatabase # Database level readwrite privileges
        admin:
          privileges: ShowCollections,DescribeDatabase,AlterDatabase,CreateCollection,DropCollection # Database level admin privileges
      collection:
        readonly:
          privileges: Query,Search,IndexDetail,GetFlushState,GetLoadState,GetLoadingProgress,HasPartition,ShowPartitions,DescribeCollection,DescribeAlias,GetStatistics,ListAliases # Collection level readonly privileges
        readwrite:
          privileges: Query,Search,IndexDetail,GetFlushState,GetLoadState,GetLoadingProgress,HasPartition,ShowPartitions,DescribeCollection,DescribeAlias,GetStatistics,ListAliases,Load,Release,Insert,Delete,Upsert,Import,Flush,Compaction,LoadBalance,CreateIndex,DropIndex,CreatePartition,DropPartition # Collection level readwrite privileges
        admin:
          privileges: Query,Search,IndexDetail,GetFlushState,GetLoadState,GetLoadingProgress,HasPartition,ShowPartitions,DescribeCollection,DescribeAlias,GetStatistics,ListAliases,Load,Release,Insert,Delete,Upsert,Import,Flush,Compaction,LoadBalance,CreateIndex,DropIndex,CreatePartition,DropPartition,CreateAlias,DropAlias # Collection level admin privileges
    internaltlsEnabled: false
    tlsMode: 0
  session:
    ttl: 30 # ttl value when session granting a lease to register service
    retryTimes: 30 # retry times when session sending etcd requests
  locks:
    metrics:
      enable: false # whether gather statistics for metrics locks
    threshold:
      info: 500 # minimum milliseconds for printing durations in info level
      warn: 1000 # minimum milliseconds for printing durations in warn level
    maxWLockConditionalWaitTime: 600 # maximum seconds for waiting wlock conditional
  storage:
    scheme: s3
    enablev2: false
  # Whether to disable the internal time messaging mechanism for the system.
  # If disabled (set to false), the system will not allow DML operations, including insertion, deletion, queries, and searches.
  # This helps Milvus-CDC synchronize incremental data
  ttMsgEnabled: true
  traceLogMode: 0 # trace request info
  bloomFilterSize: 100000 # bloom filter initial size
  bloomFilterType: BlockedBloomFilter # bloom filter type, support BasicBloomFilter and BlockedBloomFilter
  maxBloomFalsePositive: 0.001 # max false positive rate for bloom filter
  bloomFilterApplyBatchSize: 1000 # batch size when to apply pk to bloom filter
  collectionReplicateEnable: false # Whether to enable collection replication.
  usePartitionKeyAsClusteringKey: false # if true, do clustering compaction and segment prune on partition key
fielduseVectorAsClusteringKey: false # if true, do clustering compaction and segment prune on vector fieldenableVectorClusteringKey: false # if true, enable vector clustering key and vector clustering compactionlocalRPCEnabled: false # enable local rpc for internal communication when mix or standalone mode.sync:taskPoolReleaseTimeoutSeconds: 60 # The maximum time to wait for the task to finish and release resources in the poolenabledOptimizeExpr: true # Indicates whether to enable optimize exprenabledJSONKeyStats: false # Indicates sealedsegment whether to enable JSON key statsenabledGrowingSegmentJSONKeyStats: false # Indicates growingsegment whether to enable JSON key statsenableConfigParamTypeCheck: true # Indicates whether to enable config param type checkclusterID: 0 # cluster id# QuotaConfig, configurations of Milvus quota and limits.# By default, we enable:# 1. TT protection;# 2. Memory protection.# 3. Disk quota protection.# You can enable:# 1. DML throughput limitation;# 2. DDL, DQL qps/rps limitation;# 3. DQL Queue length/latency protection;# 4. 
DQL result rate protection;# If necessary, you can also manually force to deny RW requests.quotaAndLimits:enabled: true # `true` to enable quota and limits, `false` to disable.# quotaCenterCollectInterval is the time interval that quotaCenter# collects metrics from Proxies, Query cluster and Data cluster.# seconds, (0 ~ 65536)quotaCenterCollectInterval: 3forceDenyAllDDL: false # true to force deny all DDL requests, false to allow.limits:allocRetryTimes: 15 # retry times when delete alloc forward data from rate limit failedallocWaitInterval: 1000 # retry wait duration when delete alloc forward data rate failed, in millisecondcomplexDeleteLimitEnable: false # whether complex delete check forward data by limitermaxCollectionNum: 65536maxCollectionNumPerDB: 65536 # Maximum number of collections per database.maxInsertSize: -1 # maximum size of a single insert request, in bytes, -1 means no limitmaxResourceGroupNumOfQueryNode: 1024 # maximum number of resource groups of query nodesmaxGroupSize: 10 # maximum size for one single group when doing search group byddl:enabled: false # Whether DDL request throttling is enabled.# Maximum number of collection-related DDL requests per second.# Setting this item to 10 indicates that Milvus processes no more than 10 collection-related DDL requests per second, including collection creation requests, collection drop requests, collection load requests, and collection release requests.# To use this setting, set quotaAndLimits.ddl.enabled to true at the same time.collectionRate: -1# Maximum number of partition-related DDL requests per second.# Setting this item to 10 indicates that Milvus processes no more than 10 partition-related requests per second, including partition creation requests, partition drop requests, partition load requests, and partition release requests.# To use this setting, set quotaAndLimits.ddl.enabled to true at the same time.partitionRate: -1db:collectionRate: -1 # qps of db level , default no limit, rate for 
CreateCollection, DropCollection, LoadCollection, ReleaseCollectionpartitionRate: -1 # qps of db level, default no limit, rate for CreatePartition, DropPartition, LoadPartition, ReleasePartitionindexRate:enabled: false # Whether index-related request throttling is enabled.# Maximum number of index-related requests per second.# Setting this item to 10 indicates that Milvus processes no more than 10 partition-related requests per second, including index creation requests and index drop requests.# To use this setting, set quotaAndLimits.indexRate.enabled to true at the same time.max: -1db:max: -1 # qps of db level, default no limit, rate for CreateIndex, DropIndexflushRate:enabled: true # Whether flush request throttling is enabled.# Maximum number of flush requests per second.# Setting this item to 10 indicates that Milvus processes no more than 10 flush requests per second.# To use this setting, set quotaAndLimits.flushRate.enabled to true at the same time.max: -1collection:max: 0.1 # qps, default no limit, rate for flush at collection level.db:max: -1 # qps of db level, default no limit, rate for flushcompactionRate:enabled: false # Whether manual compaction request throttling is enabled.# Maximum number of manual-compaction requests per second.# Setting this item to 10 indicates that Milvus processes no more than 10 manual-compaction requests per second.# To use this setting, set quotaAndLimits.compaction.enabled to true at the same time.max: -1db:max: -1 # qps of db level, default no limit, rate for manualCompactiondbRate:enabled: false # Whether DB request throttling is enabled# Maximum number of db-related requests per second.# Setting this item to 10 indicates that Milvus processes no more than 10 db-related requests per second, including db creation/drop/alter requests.# To use this setting, set quotaAndLimits.dbRate.enabled to true at the same time.#max: -1dml:enabled: false # Whether DML request throttling is enabled.insertRate:# Highest data insertion rate 
per second.# Setting this item to 5 indicates that Milvus only allows data insertion at the rate of 5 MB/s.# To use this setting, set quotaAndLimits.dml.enabled to true at the same time.max: -1db:max: -1 # MB/s, default no limitcollection:# Highest data insertion rate per collection per second.# Setting this item to 5 indicates that Milvus only allows data insertion to any collection at the rate of 5 MB/s.# To use this setting, set quotaAndLimits.dml.enabled to true at the same time.max: -1partition:max: -1 # MB/s, default no limitupsertRate:max: -1 # MB/s, default no limitdb:max: -1 # MB/s, default no limitcollection:max: -1 # MB/s, default no limitpartition:max: -1 # MB/s, default no limitdeleteRate:# Highest data deletion rate per second.# Setting this item to 0.1 indicates that Milvus only allows data deletion at the rate of 0.1 MB/s.# To use this setting, set quotaAndLimits.dml.enabled to true at the same time.max: -1db:max: -1 # MB/s, default no limitcollection:# Highest data deletion rate per second.# Setting this item to 0.1 indicates that Milvus only allows data deletion from any collection at the rate of 0.1 MB/s.# To use this setting, set quotaAndLimits.dml.enabled to true at the same time.max: -1partition:max: -1 # MB/s, default no limitbulkLoadRate:max: -1 # MB/s, default no limit, not support yet. TODO: limit bulkLoad ratedb:max: -1 # MB/s, default no limit, not support yet. TODO: limit db bulkLoad ratecollection:max: -1 # MB/s, default no limit, not support yet. TODO: limit collection bulkLoad ratepartition:max: -1 # MB/s, default no limit, not support yet. 
TODO: limit partition bulkLoad ratedql:enabled: false # Whether DQL request throttling is enabled.searchRate:# Maximum number of vectors to search per second.# Setting this item to 100 indicates that Milvus only allows searching 100 vectors per second no matter whether these 100 vectors are all in one search or scattered across multiple searches.# To use this setting, set quotaAndLimits.dql.enabled to true at the same time.max: -1db:max: -1 # vps (vectors per second), default no limitcollection:# Maximum number of vectors to search per collection per second.# Setting this item to 100 indicates that Milvus only allows searching 100 vectors per second per collection no matter whether these 100 vectors are all in one search or scattered across multiple searches.# To use this setting, set quotaAndLimits.dql.enabled to true at the same time.max: -1partition:max: -1 # vps (vectors per second), default no limitqueryRate:# Maximum number of queries per second.# Setting this item to 100 indicates that Milvus only allows 100 queries per second.# To use this setting, set quotaAndLimits.dql.enabled to true at the same time.max: -1db:max: -1 # qps, default no limitcollection:# Maximum number of queries per collection per second.# Setting this item to 100 indicates that Milvus only allows 100 queries per collection per second.# To use this setting, set quotaAndLimits.dql.enabled to true at the same time.max: -1partition:max: -1 # qps, default no limitlimitWriting:# forceDeny false means dml requests are allowed (except for some# specific conditions, such as memory of nodes to water marker), true means always reject all dml requests.forceDeny: falsettProtection:enabled: false# maxTimeTickDelay indicates the backpressure for DML Operations.# DML rates would be reduced according to the ratio of time tick delay to maxTimeTickDelay,# if time tick delay is greater than maxTimeTickDelay, all DML requests would be rejected.# secondsmaxTimeTickDelay: 300memProtection:# When memory usage 
> memoryHighWaterLevel, all dml requests would be rejected;# When memoryLowWaterLevel < memory usage < memoryHighWaterLevel, reduce the dml rate;# When memory usage < memoryLowWaterLevel, no action.enabled: truedataNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in DataNodesdataNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in DataNodesqueryNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in QueryNodesqueryNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in QueryNodesgrowingSegmentsSizeProtection:# No action will be taken if the growing segments size is less than the low watermark.# When the growing segments size exceeds the low watermark, the dml rate will be reduced,# but the rate will not be lower than minRateRatio * dmlRate.enabled: falseminRateRatio: 0.5lowWaterLevel: 0.2highWaterLevel: 0.4diskProtection:enabled: true # When the total file size of object storage is greater than `diskQuota`, all dml requests would be rejected;diskQuota: -1 # MB, (0, +inf), default no limitdiskQuotaPerDB: -1 # MB, (0, +inf), default no limitdiskQuotaPerCollection: -1 # MB, (0, +inf), default no limitdiskQuotaPerPartition: -1 # MB, (0, +inf), default no limitl0SegmentsRowCountProtection:enabled: false # switch to enable l0 segment row count quotalowWaterLevel: 30000000 # l0 segment row count quota, low water levelhighWaterLevel: 50000000 # l0 segment row count quota, high water leveldeleteBufferRowCountProtection:enabled: false # switch to enable delete buffer row count quotalowWaterLevel: 32768 # delete buffer row count quota, low water levelhighWaterLevel: 65536 # delete buffer row count quota, high water leveldeleteBufferSizeProtection:enabled: false # switch to enable delete buffer size quotalowWaterLevel: 134217728 # delete buffer size quota, low water levelhighWaterLevel: 268435456 # delete buffer size quota, high water levellimitReading:# forceDeny false means dql requests are allowed (except for some# specific conditions, 
such as collection has been dropped), true means always reject all dql requests.forceDeny: falsetrace:# trace exporter type, default is stdout,# optional values: ['noop','stdout', 'jaeger', 'otlp']exporter: noop# fraction of traceID based sampler,# optional values: [0, 1]# Fractions >= 1 will always sample. Fractions < 0 are treated as zero.sampleFraction: 0jaeger:url: # when exporter is jaeger should set the jaeger's URLotlp:endpoint: # example: "127.0.0.1:4317" for grpc, "127.0.0.1:4318" for httpmethod: # otlp export method, acceptable values: ["grpc", "http"], using "grpc" by defaultsecure: trueinitTimeoutSeconds: 10 # segcore initialization timeout in seconds, preventing otlp grpc hangs forever#when using GPU indexing, Milvus will utilize a memory pool to avoid frequent memory allocation and deallocation.#here, you can set the size of the memory occupied by the memory pool, with the unit being MB.#note that there is a possibility of Milvus crashing when the actual memory demand exceeds the value set by maxMemSize.#if initMemSize and MaxMemSize both set zero,#milvus will automatically initialize half of the available GPU memory,#maxMemSize will the whole available GPU memory.gpu:initMemSize: 2048 # Gpu Memory Pool init sizemaxMemSize: 4096 # Gpu Memory Pool Max size# Any configuration related to the streaming node server.streamingNode:ip: # TCP/IP address of streamingNode. 
  # If not specified, use the first unicastable address
  port: 22222 # TCP port of streamingNode
  grpc:
    serverMaxSendSize: 268435456 # The maximum size of each RPC request that the streamingNode can send, unit: byte
    serverMaxRecvSize: 268435456 # The maximum size of each RPC request that the streamingNode can receive, unit: byte
    clientMaxSendSize: 268435456 # The maximum size of each RPC request that the clients on streamingNode can send, unit: byte
    clientMaxRecvSize: 268435456 # The maximum size of each RPC request that the clients on streamingNode can receive, unit: byte

# Any configuration related to the streaming service.
streaming:
  walBalancer:
    # The interval of balance task trigger at background, 1 min by default.
    # It's ok to set it into duration string, such as 30s or 1m30s, see time.ParseDuration
    triggerInterval: 1m
    # The initial interval of balance task trigger backoff, 50 ms by default.
    # It's ok to set it into duration string, such as 30s or 1m30s, see time.ParseDuration
    backoffInitialInterval: 50ms
    backoffMultiplier: 2 # The multiplier of balance task trigger backoff, 2 by default
  walBroadcaster:
    concurrencyRatio: 1 # The concurrency ratio based on number of CPU for wal broadcaster, 1 by default.
  txn:
    defaultKeepaliveTimeout: 10s # The default keepalive timeout for wal txn, 10s by default

# Any configuration related to the knowhere vector search engine
knowhere:
  enable: true # When enable this configuration, the index parameters defined following will be automatically populated as index parameters, without requiring user input.
  DISKANN:
    build:
      max_degree: 56 # Maximum degree of the Vamana graph
      pq_code_budget_gb_ratio: 0.125 # Size limit on the PQ code (compared with raw data)
      search_cache_budget_gb_ratio: 0.1 # Ratio of cached node numbers to raw data
      search_list_size: 100 # Size of the candidate list during building graph
    search:
      beam_width_ratio: 4 # Ratio between the maximum number of IO requests per search iteration and CPU number

- Edit docker-compose.yaml:

```yaml
services:
  etcd:
    container_name: milvus-etcd
    image: quay.io/coreos/etcd:v3.5.18
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
      - ETCD_SNAPSHOT_COUNT=50000
    volumes:
      - ./etcd:/etcd
    command: etcd -advertise-client-urls=http://etcd:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
    healthcheck:
      test: ["CMD", "etcdctl", "endpoint", "health"]
      interval: 30s
      timeout: 20s
      retries: 3

  standalone:
    container_name: milvus-standalone
    image: milvusdb/milvus:v2.5.19
    command: ["milvus", "run", "standalone"]
    security_opt:
      - seccomp:unconfined
    environment:
      - ETCD_ENDPOINTS=etcd:2379
      - TIMEZONE=Asia/Shanghai
    volumes:
      - ./milvus.yaml:/milvus/configs/milvus.yaml
      - ./milvus:/var/lib/milvus
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9091/healthz"]
      interval: 30s
      start_period: 90s
      timeout: 20s
      retries: 3
    ports:
      - "65011:19530" # gRPC
      - "65012:9091" # HTTP management panel (possibly a bug: it is still directly accessible even after a username and password are set; if you know how to fix this, please leave a comment)
    depends_on:
      - "etcd"
```

- Start Milvus:

```shell
docker-compose up -d
```
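After `docker-compose up -d` returns, Milvus may still need some time to start (the compose file above allows a 90-second `start_period`). The following is a minimal sketch, not part of the original setup, that polls the same `/healthz` endpoint the container health check uses. It assumes the script runs on the Docker host and that host port 65012 is mapped to 9091 as in the compose file; the function names are illustrative.

```python
import time
import urllib.error
import urllib.request


def is_healthy(url, timeout=5.0):
    """Return True if the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False


def wait_until_healthy(url, attempts=30, delay=3.0, probe=is_healthy, sleep=time.sleep):
    """Poll `url` until `probe` succeeds or the attempts run out."""
    for _ in range(attempts):
        if probe(url):
            return True
        sleep(delay)
    return False


# Example call (assumes the 65012:9091 port mapping from the compose file):
# wait_until_healthy("http://localhost:65012/healthz")
```

The `probe` and `sleep` parameters exist only so the loop can be exercised without a running container.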
Configure Nginx Proxy Manager to reverse-proxy gRPC
- Configure it as shown:
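```nginx
# The include line below can be removed; it only lets NPM obtain the real client IP when the Cloudflare CDN is in front.
# If you need it, see https://blog.useforall.com/posts/nginx-proxy-manager-get-real-client-ip-a-unified-solution for how to configure it.
include /data/nginx/custom/cloudflare_ips.conf;
underscores_in_headers on;

location / {
    # Verify that this is a gRPC request (optional but recommended)
    if ($content_type !~ "application/grpc") {
        return 404;
    }

    # Timeout and keepalive settings
    grpc_read_timeout 300s;
    grpc_send_timeout 300s;
    grpc_socket_keepalive on;

    grpc_pass grpc://172.17.0.1:65011;
}
access_log off;
```

- Go to Cloudflare and turn on the gRPC switch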
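Note that for some people this switch may show Join Beta instead; just click it to join. Some of my domains could enable gRPC directly, while others required joining the beta, so it depends on your account. After joining the beta you will see: "Thanks for your interest! You will be able to enable gRPC support once you have been admitted to the beta." How long admission takes is unclear; it may be a few minutes or a few hours.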
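To make the `if ($content_type !~ "application/grpc")` check in the Nginx configuration above more concrete, here is a small sketch that mimics it in Python. It relies on nginx's `~` operator being a case-sensitive, unanchored regular-expression match; the function name is illustrative and the sample header values are assumptions.

```python
import re

# Same pattern as in the nginx `if`; `~` is case-sensitive and unanchored.
GRPC_PATTERN = re.compile(r"application/grpc")


def is_grpc_request(content_type):
    """Return True when the request would pass the nginx check (no 404)."""
    return GRPC_PATTERN.search(content_type or "") is not None


for value in ["application/grpc", "application/grpc+proto", "application/json", ""]:
    print(repr(value), "->", "proxied" if is_grpc_request(value) else "404")
```

Any request without a gRPC content type, such as a browser visiting the domain, gets a 404 instead of reaching Milvus.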
- Finally, follow the official tutorial to connect to the Milvus database
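As a rough illustration of that last step, the sketch below connects through the proxied domain with the `pymilvus` SDK's `MilvusClient`. The domain `milvus.example.com` is hypothetical, and the sketch assumes that Nginx Proxy Manager and Cloudflare serve the proxy host over TLS on port 443 and that authentication is enabled in `milvus.yaml` with the root password set earlier. Treat it as a starting point and defer to the official tutorial.

```python
def build_uri(host, port=443, tls=True):
    """Build the URI passed to MilvusClient; https selects a TLS channel."""
    scheme = "https" if tls else "http"
    return f"{scheme}://{host}:{port}"


def build_token(user, password):
    """Milvus accepts credentials as a 'user:password' token."""
    return f"{user}:{password}"


def connect(host, user, password, port=443, tls=True):
    """Return a MilvusClient for the given endpoint (requires `pip install pymilvus`)."""
    # Imported lazily so the helpers above work without pymilvus installed.
    from pymilvus import MilvusClient

    return MilvusClient(uri=build_uri(host, port, tls), token=build_token(user, password))


# Example (hypothetical domain; replace the password with your defaultRootPassword):
# client = connect("milvus.example.com", "root", "xxxxxx")
# print(client.list_collections())
```

For a direct test on the Docker host without the proxy, `connect("localhost", "root", "xxxxxx", port=65011, tls=False)` would target the gRPC port mapped in the compose file.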
https://blog.useforall.com/posts/docker-deploy-milvus-with-nginx-proxy-manager-grpc/ Last updated on 2025-10-27 (20 days ago)
Lim's Blog