1. Install dependencies
2. Install and start Cassandra
tar -xzvf apache-cassandra-3.11.4-bin.tar.gz
# Directory where Cassandra should store hints.
# If not set, the default directory is $CASSANDRA_HOME/data/hints.
hints_directory: /data/cassandra/hints
# Directories where Cassandra should store data on disk. Cassandra
# will spread data evenly across them, subject to the granularity of
# the configured compaction strategy.
# If not set, the default directory is $CASSANDRA_HOME/data/data.
# data_file_directories:
# - /var/lib/cassandra/data
data_file_directories:
    - /data1/cassandra/data
    - /data2/cassandra/data
# commit log. when running on magnetic HDD, this should be a
# separate spindle than the data directories.
# If not set, the default directory is $CASSANDRA_HOME/data/commitlog.
commitlog_directory: /data/data/cassandra/commitlog
# saved caches
# If not set, the default directory is $CASSANDRA_HOME/data/saved_caches.
saved_caches_directory: /data/cassandra/saved_caches
# any class that implements the SeedProvider interface and has a
# constructor that takes a Map<String, String> of parameters will do.
seed_provider:
    # Addresses of hosts that are deemed contact points.
    # Cassandra nodes use this list of hosts to find each other and learn
    # the topology of the ring. You must change this if you are running
    # multiple nodes!
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # seeds is actually a comma-delimited list of addresses.
          # Ex: "<ip1>,<ip2>,<ip3>"
          - seeds: "10.10.10.1,10.10.10.2"
# Set listen_address OR listen_interface, not both. Interfaces must correspond
# to a single address, IP aliasing is not supported.
listen_interface: bond1
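The two rules stated in the comments above (seeds is a comma-delimited string, and listen_address and listen_interface are mutually exclusive) can be checked before starting a node. The following is a minimal sketch, assuming the YAML file has already been loaded into a plain dict; the function names `check_listen_settings` and `seed_addresses` are hypothetical helpers, not part of Cassandra.

```python
def check_listen_settings(config):
    """Return a list of problems with the listen_* settings of a parsed config."""
    problems = []
    has_address = bool(config.get("listen_address"))
    has_interface = bool(config.get("listen_interface"))
    if has_address and has_interface:
        problems.append("set listen_address OR listen_interface, not both")
    return problems


def seed_addresses(config):
    """Split the comma-delimited seeds string into a list of addresses."""
    provider = config["seed_provider"][0]
    seeds = provider["parameters"][0]["seeds"]
    return [address.strip() for address in seeds.split(",") if address.strip()]


# The seed and listen settings from the configuration above, written as the
# nested dict/list structure that the YAML describes.
EXAMPLE_CONFIG = {
    "seed_provider": [
        {
            "class_name": "org.apache.cassandra.locator.SimpleSeedProvider",
            "parameters": [{"seeds": "10.10.10.1,10.10.10.2"}],
        }
    ],
    "listen_interface": "bond1",
}

print(check_listen_settings(EXAMPLE_CONFIG))
print(seed_addresses(EXAMPLE_CONFIG))
```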
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load      Tokens  Owns  Host ID                               Rack
UN  10.10.10.1  11.7 TiB  256     ?     85ad2d5f-10af-475a-915f-f95dd705e133  rack1
UN  10.10.10.2  10.1 TiB  256     ?     568eb231-eaa4-429c-b2fa-9b3a7db36ffd  rack1
UN  10.10.10.3  10.4 TiB  256     ?     4bf77471-4a83-4217-ba5c-96ce1bbd7647  rack1
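Status output in this shape is easy to turn into records when you want to script a health check or look up a Host ID. The sketch below assumes exactly the column layout shown above (a load value followed by a unit, eight whitespace-separated fields per node row); rows in any other shape are skipped. `NodeStatus` and `parse_status` are illustrative names, not a Cassandra API.

```python
from typing import List, NamedTuple


class NodeStatus(NamedTuple):
    state: str    # two letters: U/D (up, down) then N/L/J/M (normal, leaving, joining, moving)
    address: str
    load: str     # value and unit, e.g. "11.7 TiB"
    tokens: int
    owns: str
    host_id: str
    rack: str


def parse_status(text: str) -> List[NodeStatus]:
    """Extract node rows from status output; header and legend lines are skipped."""
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8 or len(fields[0]) != 2:
            continue
        if fields[0][0] not in "UD" or fields[0][1] not in "NLJM":
            continue
        state, address, load_value, load_unit, tokens, owns, host_id, rack = fields
        nodes.append(
            NodeStatus(state, address, f"{load_value} {load_unit}",
                       int(tokens), owns, host_id, rack)
        )
    return nodes


SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.10.10.1 11.7 TiB 256 ? 85ad2d5f-10af-475a-915f-f95dd705e133 rack1
UN 10.10.10.2 10.1 TiB 256 ? 568eb231-eaa4-429c-b2fa-9b3a7db36ffd rack1
UN 10.10.10.3 10.4 TiB 256 ? 4bf77471-4a83-4217-ba5c-96ce1bbd7647 rack1
"""

for node in parse_status(SAMPLE):
    print(node.address, node.state, node.host_id)
```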
1. Starting from the command line
Run bin/cqlsh <ip> to enter Cassandra's shell client. The CQL commands it accepts are fairly similar to MySQL's.
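-- Create a keyspace (similar to a MySQL database). replication_factor is the number of replicas; class can be SimpleStrategy or NetworkTopologyStrategy.
-- If the cluster spans multiple data centers, choose NetworkTopologyStrategy. durable_writes controls whether writes to this keyspace go through the commit log.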
create keyspace mykeyspace with replication={'class':'SimpleStrategy','replication_factor':3} AND durable_writes = true;
use mykeyspace;
-- Create a table. domain is the partition key; day and key are clustering columns.
CREATE TABLE t(
    domain text,
    day text,
    key text,
    value int,
    PRIMARY KEY (domain, day, key)
) WITH CLUSTERING ORDER BY (day DESC, key ASC);
-- Inserting, deleting, updating, and selecting rows work much as in MySQL
INSERT INTO t (domain, day, key, value) VALUES ('test.com', '2019-02-24', 'key0', 10);
SELECT * FROM t;
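The keyspace above uses SimpleStrategy. For a cluster spread over several data centers, NetworkTopologyStrategy takes one replica count per data center instead of a single replication_factor. The sketch below only builds the statement text; `create_keyspace_cql` is a hypothetical helper, `datacenter1` is the data center name reported by the status output earlier, and `datacenter2` is an assumed second data center used purely for illustration.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def create_keyspace_cql(name, replicas_per_dc, durable_writes=True):
    """Build a CREATE KEYSPACE statement that uses NetworkTopologyStrategy."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"not a plain CQL identifier: {name!r}")
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc, replicas in replicas_per_dc.items():
        if "'" in dc:
            raise ValueError(f"unsupported data center name: {dc!r}")
        options.append(f"'{dc}': {int(replicas)}")
    replication = "{" + ", ".join(options) + "}"
    durable = "true" if durable_writes else "false"
    return (
        f"CREATE KEYSPACE {name} WITH replication = {replication} "
        f"AND durable_writes = {durable};"
    )


print(create_keyspace_cql("mykeyspace", {"datacenter1": 3, "datacenter2": 2}))
```

The printed statement can be pasted into cqlsh; the data center names have to match what the cluster's snitch reports.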
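2. Query conditions
Cassandra queries can only filter on primary key columns and indexed columns, and conditions on primary key columns must be given in primary key order.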
CREATE TABLE t(
    domain text,
    day text,
    key text,
    value int,
    PRIMARY KEY (domain, day, key)
) WITH CLUSTERING ORDER BY (day DESC, key ASC);
Supported query conditions:
select * from t where domain = ?
select * from t where domain = ? and day = ?
select * from t where domain = ? and day > ?
To query by domain and key (skipping day), you need to add ALLOW FILTERING:
select * from t where domain = ? and key = ? allow filtering
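The ordering rule behind these four queries can be written down as a small check. This is a deliberately simplified model, assuming the table t above (partition key domain, clustering columns day and key) and only the operators =, <, <=, >, >=; it ignores IN, token(), and secondary indexes, and `follows_primary_key_order` is an illustrative name rather than anything Cassandra provides.

```python
PARTITION_KEY = ["domain"]
CLUSTERING_COLUMNS = ["day", "key"]


def follows_primary_key_order(conditions):
    """conditions maps column name -> operator, e.g. {"domain": "=", "day": ">"}.

    Returns True when the WHERE clause restricts the primary key in order,
    False when the query would need ALLOW FILTERING (or an index).
    """
    if not conditions:
        return True  # no WHERE clause at all
    key_columns = PARTITION_KEY + CLUSTERING_COLUMNS
    if any(column not in key_columns for column in conditions):
        return False  # condition on a non-key column
    # Every partition key column must be fixed with an equality.
    if any(conditions.get(column) != "=" for column in PARTITION_KEY):
        return False
    skipped = False      # an earlier clustering column had no condition
    range_seen = False   # an earlier clustering column used a range operator
    for column in CLUSTERING_COLUMNS:
        operator = conditions.get(column)
        if operator is None:
            skipped = True
            continue
        if skipped or range_seen:
            return False
        if operator != "=":
            range_seen = True
    return True


print(follows_primary_key_order({"domain": "=", "day": ">"}))
print(follows_primary_key_order({"domain": "=", "key": "="}))
```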
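A Cassandra index is in effect a new table: the indexed column of the original table becomes the primary key of the index table, and the stored value is the primary key of the original table. For that reason secondary indexes are generally not recommended. When you need to filter on a non-primary-key column, the usual approach is for the application to create a separate table that uses that column as its primary key.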
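The lookup-table idea can be illustrated without a cluster. The sketch below is an in-memory model only, using plain dicts in place of the two tables; it is not driver code, and `TableWithLookup` is a hypothetical name. It shows the part the application has to take on: every write updates both the base table and the table keyed by the filtered column.

```python
from collections import defaultdict


class TableWithLookup:
    """Base table keyed by (domain, day, key) plus a lookup table keyed by value."""

    def __init__(self):
        self.base = {}                    # (domain, day, key) -> value
        self.by_value = defaultdict(set)  # value -> {(domain, day, key), ...}

    def insert(self, domain, day, key, value):
        primary_key = (domain, day, key)
        previous = self.base.get(primary_key)
        if previous is not None:
            # Overwriting a row: drop the stale entry from the lookup table.
            self.by_value[previous].discard(primary_key)
        self.base[primary_key] = value
        self.by_value[value].add(primary_key)

    def find_by_value(self, value):
        """Return the primary keys of all rows whose value column equals value."""
        return sorted(self.by_value.get(value, ()))


table = TableWithLookup()
table.insert("test.com", "2019-02-24", "key0", 10)
table.insert("test.com", "2019-02-24", "key1", 10)
print(table.find_by_value(10))
```

In a real deployment the two writes go to two Cassandra tables, so the application also has to decide what happens when only one of them succeeds.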
1. Adding a node
bin/cassandra &
To add several machines at the same time, start each of them with the additional option -Dcassandra.consistent.rangemovement=false.
2. Removing a node
bin/nodetool removenode host_id
The host_id of each node is listed by bin/nodetool status.
If the removal never finishes, you can run:
bin/nodetool removenode force
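1. Some articles recommend turning off compaction on the Cassandra nodes (bin/nodetool disableautocompaction) before adding nodes. In practice this can backfire: when the cluster holds a lot of data, moving data between nodes after the expansion can take several hours, and a large number of small files accumulate in the meantime. Once compaction is turned back on there are many files to compact, which can drive disk IO up sharply and affect normal use.
2. When creating Cassandra tables, do not run the create table command from several places at once, even with if not exists. Multiple clients creating a table concurrently can leave Cassandra throwing org.apache.cassandra.db.UnknownColumnFamilyException.
3. A corrupted commit log can cause the Cassandra process to shut down and then fail to start. If the data is kept in multiple replicas, deleting the corrupted commit log file and restarting is enough.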
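Original content notice: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, contact cloudcommunity@tencent.com to have it removed.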