0.7.0
=====

Features
--------
- Secondary indexes (indexes on column values) are now supported
- Row size limit increased from 2GB to 2 billion columns. Rows
  are no longer read into memory during compaction.
- Keyspace and ColumnFamily definitions may be added and modified live
- Streaming data for repair or node movement no longer requires
  an anticompaction step first
- NetworkTopologyStrategy (formerly DatacenterShardStrategy) is ready for
  use, enabling ConsistencyLevel.DCQUORUM and DCQUORUMSYNC. See comments
  in `cassandra.yaml`.
- Optional per-Column time-to-live field allows expiring data without
  having to issue explicit remove commands
- `truncate` thrift method allows clearing an entire ColumnFamily at once
- Hadoop OutputFormat and Streaming [non-jvm map/reduce via stdin/out]
  support
- Up to 8x faster reads from row cache
- A new ByteOrderedPartitioner supports bytes keys with arbitrary content,
  and orders keys by their byte value. This should be used in new
  deployments instead of OrderPreservingPartitioner.
- Optional round-robin scheduling between keyspaces for multitenant
  clusters
- Dynamic endpoint snitch mitigates the impact of impaired nodes
- New `IntegerType`, which is faster than LongType and allows integers
  of both fewer and more bits than Long's 64
- A revamped authentication system that decouples authorization and
  allows finer-grained control of resources.

Upgrading
---------
The Thrift API has changed in incompatible ways; see below, and refer
to http://wiki.apache.org/cassandra/ClientOptions for a list of
higher-level clients that have been updated to support the 0.7 API.

The Cassandra inter-node protocol is incompatible with 0.6.x
releases (and with 0.7 beta1), meaning you will have to bring your
cluster down prior to upgrading: you cannot mix 0.6 and 0.7 nodes.

The hints schema was changed from 0.6 to 0.7. Cassandra automatically
snapshots and then truncates the hints column family as part of
starting up 0.7 for the first time.

Keyspace and ColumnFamily definitions are stored in the system
keyspace, rather than the configuration file.

The process to upgrade is:
1) Run "nodetool drain" on _each_ 0.6 node. When drain finishes (log
   message "Node is drained" appears), stop the process.
2) Convert your storage-conf.xml to the new cassandra.yaml using
   "bin/config-converter".
3) Rename any of your keyspace or column family names that do not adhere
   to the '^\w+' regex convention.
4) Start up your cluster with the 0.7 version.
5) Initialize your Keyspace and ColumnFamily definitions using
   "bin/schematool <host> <jmxport> import". _You only need to do
   this on one node_.
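
The naming rule in step 3 can be checked before the upgrade with a short
script. The sketch below is only an illustration: it assumes that a legal
name must consist entirely of ASCII word characters (letters, digits,
underscore), which is a stricter reading than the literal '^\w+' pattern,
and the helper name illegal_names and the sample names are hypothetical.

```python
import re

# Assumption: the whole name must be made of word characters, so we use
# fullmatch rather than a prefix match. re.ASCII restricts \w to
# [A-Za-z0-9_], which is also an assumption about the intended rule.
LEGAL_NAME = re.compile(r"\w+", re.ASCII)


def illegal_names(names):
    """Return the names that would need renaming before the upgrade."""
    return [name for name in names if not LEGAL_NAME.fullmatch(name)]


if __name__ == "__main__":
    # Hypothetical keyspace and column family names, for illustration only.
    candidates = ["Keyspace1", "Standard_1", "user-data", "my.cf", ""]
    for name in illegal_names(candidates):
        print("rename needed:", repr(name))
```

Feed it the names from your own storage-conf.xml; anything it reports is a
candidate for renaming before the 0.7 nodes are started.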

Thrift API
----------
- The Cassandra server now defaults to framed mode, rather than
  unframed. Unframed is obsolete and will be removed in the future.
- The Cassandra Thrift interface file has been updated for Thrift 0.5.
  If you are compiling your own client code from the interface, you
  will need to upgrade the Thrift compiler to match.
- Row keys are now bytes: keys stored by versions prior to 0.7.0 will be
  returned as UTF-8 encoded bytes. OrderPreservingPartitioner and
  CollatingOrderPreservingPartitioner continue to expect that keys contain
  UTF-8 encoded strings, but RandomPartitioner now works on any key data.
- Keyspace parameters have been replaced with the per-connection
  set_keyspace method.
- The return type for login() is now AccessLevel.
- The get_string_property() method has been removed.
- The get_string_list_property() method has been removed.
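
Because row keys are now bytes, client code that used string keys has to
encode them itself. The sketch below shows one way to do that in plain
Python, without any Cassandra client library; the helper names to_row_key
and is_utf8_key are hypothetical and not part of any API.

```python
def to_row_key(key):
    """Encode a text key as UTF-8 bytes; pass bytes keys through unchanged."""
    if isinstance(key, bytes):
        return key
    return key.encode("utf-8")


def is_utf8_key(key_bytes):
    """Return True if the key bytes decode cleanly as UTF-8.

    Useful as a sanity check when the cluster uses a partitioner that
    expects UTF-8 encoded string keys.
    """
    try:
        key_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(to_row_key("user42"))
    print(is_utf8_key(b"\xff\xfe"))
```

Keys written by a pre-0.7.0 client should round-trip through to_row_key
unchanged in meaning, since they come back as UTF-8 encoded bytes.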

Configuration
-------------
- Configuration file renamed to cassandra.yaml and log4j.properties to
  log4j-server.properties
- PropertyFileSnitch configuration file renamed to
  cassandra-topology.properties
- The ThriftAddress and ThriftPort directives have been renamed to
  RPCAddress and RPCPort respectively.
- EndPointSnitch was renamed to RackInferringSnitch. A new SimpleSnitch
  has been added.
- RackUnawareStrategy and RackAwareStrategy have been renamed to
  SimpleStrategy and OldNetworkTopologyStrategy, respectively.
- RowWarningThresholdInMB replaced with in_memory_compaction_limit_in_mb
- GCGraceSeconds is now per-ColumnFamily instead of global
- Keyspace and column family names that do not conform to a '^\w+' regex
  are considered illegal.
- Keyspace and column family definitions will need to be loaded via
  "bin/schematool <host> <jmxport> import". _You only need to do this on
  one node_.
- In addition to an authenticator, an authority must be configured as
  well. Users of SimpleAuthenticator should use SimpleAuthority for this
  value (the default is AllowAllAuthority, which corresponds with
  AllowAllAuthenticator).
- The format of access.properties has changed; see the sample configuration
  conf/access.properties for documentation on the new format.


JMX
---
- StreamingService moved from o.a.c.streaming to o.a.c.service
- GMFD renamed to GOSSIP_STAGE
- {Min,Mean,Max}RowCompactedSize renamed to {Min,Mean,Max}RowSize
  since it no longer has to wait until compaction to be computed

Other
-----
- If extending AbstractType, make sure you follow the singleton pattern
  followed by Cassandra core AbstractType classes: provide a public
  static final variable called 'instance'.


0.6.6
=====

Upgrading
---------
- As part of the cache-saving feature, a third directory
  (along with data and commitlog) has been added to the config
  file. You will need to set and create this directory
  when restarting your node into 0.6.6.


0.6.1
=====

Upgrading
---------
- We try to keep minor versions 100% compatible (data format,
  commitlog format, network format) within the major series, but
  we introduced a network-level incompatibility in 0.6.1.
  Thus, if you are upgrading from 0.6.0 to any higher version
  (0.6.1, 0.6.2, etc.) then you will need to restart your entire
  cluster with the new version, instead of being able to do a
  rolling restart.


0.6.0
=====

Features
--------
- row caching: configure with the RowsCached attribute in
  ColumnFamily definition
- Hadoop map/reduce support: see contrib/word_count for an example
- experimental authentication support, described under
  Authenticator in storage-conf.xml

Configuration
-------------
- MemtableSizeInMB has been replaced by MemtableThroughputInMB, which
  triggers a memtable flush when the specified amount of data has
  been written, including overwrites.
- MemtableObjectCountInMillions has been replaced by the
  MemtableOperationsInMillions directive, which causes a memtable flush
  to occur after the specified number of operations.
- Like MemtableSizeInMB, BinaryMemtableSizeInMB has been replaced by
  BinaryMemtableThroughputInMB.
- Replication factor is now per-keyspace, rather than global.
- KeysCachedFraction is deprecated in favor of KeysCached
- RowWarningThresholdInMB added, to warn before very large rows
  get big enough to threaten node stability
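
The two memtable thresholds above can be pictured with a toy model. This is
not Cassandra's implementation; it only illustrates the behaviour the notes
describe. It assumes that either threshold alone triggers a flush, that a
megabyte is 1024 * 1024 bytes, and that the class name MemtableModel and its
methods are hypothetical.

```python
class MemtableModel:
    """Toy model of the two flush thresholds; not Cassandra's own code."""

    def __init__(self, throughput_mb, operations_millions):
        # Assumption: 1 MB = 1024 * 1024 bytes, 1 million = 1_000_000.
        self.max_bytes = round(throughput_mb * 1024 * 1024)
        self.max_ops = round(operations_millions * 1_000_000)
        self.bytes_written = 0
        self.operations = 0
        self.flushes = 0

    def write(self, size_in_bytes):
        """Record one write. An overwrite counts just like a new column."""
        self.bytes_written += size_in_bytes
        self.operations += 1
        # Assumption: crossing either threshold is enough to flush.
        if self.bytes_written >= self.max_bytes or self.operations >= self.max_ops:
            self.flush()

    def flush(self):
        """Count the flush and reset both counters."""
        self.flushes += 1
        self.bytes_written = 0
        self.operations = 0


if __name__ == "__main__":
    model = MemtableModel(throughput_mb=1, operations_millions=0.001)
    # Two 512 KiB writes to the same column still add up to 1 MB written.
    model.write(512 * 1024)
    model.write(512 * 1024)
    print("flushes after overwrites:", model.flushes)
```

The point of the model is that throughput counts bytes written rather than
bytes retained, so a workload that overwrites the same columns repeatedly
still reaches the throughput threshold.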

Thrift API
----------
- removed deprecated get_key_range method
- added batch_mutate method
- deprecated multiget and batch_insert methods in favor of
  multiget_slice and batch_mutate, respectively
- added ConsistencyLevel.ANY, for when you want write
  availability even when it may not be readable immediately.
  Unlike CL.ZERO, though, it will throw an exception if
  it cannot be written *somewhere*.

JMX metrics
-----------
- read and write statistics are reported as lifetime totals,
  instead of averages over the last minute. Averages since the
  last time they were requested are also available for convenience.
- cache hit rate statistics are now available from JMX under
  org.apache.cassandra.db.Caches
- compaction JMX metrics are moved to
  org.apache.cassandra.db.CompactionManager. PendingTasks is now
  a much better estimate of compactions remaining, and the
  progress of the current compaction has been added.
- commitlog JMX metrics are moved to org.apache.cassandra.db.Commitlog
- progress of data streaming during bootstrap, loadbalance, or other
  data migration, is available under
  org.apache.cassandra.streaming.StreamingService.
  See http://wiki.apache.org/cassandra/Streaming for details.
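
Since read and write statistics are now lifetime totals, a monitoring
script that wants a rate has to take two samples and divide the difference
by the elapsed time. The sketch below shows that arithmetic only; how the
totals are fetched over JMX is left out, and the Sample type and
rate_per_second helper are hypothetical names. It also assumes that a total
can only shrink if the node was restarted between samples.

```python
from collections import namedtuple

# One observation of a lifetime counter, e.g. total reads so far.
Sample = namedtuple("Sample", ["timestamp_seconds", "lifetime_total"])


def rate_per_second(earlier, later):
    """Return the average rate between two samples of a lifetime total."""
    elapsed = later.timestamp_seconds - earlier.timestamp_seconds
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    delta = later.lifetime_total - earlier.lifetime_total
    if delta < 0:
        # Assumption: the counter was reset, for example by a restart.
        raise ValueError("lifetime total decreased between samples")
    return delta / elapsed


if __name__ == "__main__":
    first = Sample(timestamp_seconds=0, lifetime_total=1000)
    second = Sample(timestamp_seconds=60, lifetime_total=4000)
    print(rate_per_second(first, second))  # 3000 operations over 60 s -> 50.0
```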

Installation/Upgrade
--------------------
- 0.6 network traffic is not compatible with earlier versions. You
  will need to shut down all your nodes at once, upgrade, then restart.



0.5.0
=====

0. The commitlog format has changed (but sstable format has not).
   When upgrading from 0.4, empty the commitlog either by running
   bin/nodeprobe flush on each machine and waiting for the flush to finish,
   or by simply removing the commitlog directory if you only have test data.
   (If more writes come in after the flush command, starting 0.5 will error
   out; if that happens, just go back to 0.4 and flush again.)
   The format changed twice: from 0.4 to beta1, and from beta2 to RC1.

.5 The gossip protocol has changed, meaning 0.5 nodes cannot coexist
   in a cluster of 0.4 nodes or vice versa; you must upgrade your
   whole cluster at the same time.

1. Bootstrap, move, load balancing, and active repair have been added.
   See http://wiki.apache.org/cassandra/Operations. When upgrading
   from 0.4, leave autobootstrap set to false for the first restart
   of your old nodes.

2. Performance improvements across the board, especially on the write
   path (over 100% improvement in stress.py throughput).

3. Configuration:
   - Added "comment" field to ColumnFamily definition.
   - Added MemtableFlushAfterMinutes, a global replacement for the
     old per-CF FlushPeriodInMinutes setting
   - Added key cache settings

4. Thrift:
   - Added get_range_slice, deprecating get_key_range



0.4.2
=====

1. Improved default garbage collector options significantly --
   throughput will be 30% higher or more.



0.4.1
=====

1. SnapshotBeforeCompaction configuration option allows snapshotting
   before each compaction, which allows rolling back to any version
   of the data.



0.4.0
=====

1. On-disk data format has changed to allow billions of keys/rows per
   node instead of only millions. The new format is incompatible with 0.3;
   see 0.3 notes below for how to import data from a 0.3 install.

2. Cassandra now supports multiple keyspaces. Typically you will have
   one keyspace per application, allowing applications to create and
   modify ColumnFamilies at will without worrying about collisions
   with others in the same cluster.

3. Many Thrift API changes and documentation. See
   http://wiki.apache.org/cassandra/API

4. Removed the web interface in favor of JMX and bin/nodeprobe, which
   has significantly enhanced functionality.

5. Renamed configuration "<Table>" to "<Keyspace>".

6. Added commitlog fsync; see "<CommitLogSync>" in configuration.



0.3.0
=====

1. With enough and large enough keys in a ColumnFamily, Cassandra will
   run out of memory trying to perform compactions (data file merges).
   The size of what is stored in memory is (S + 16) * (N + M) where S
   is the size of the key (usually 2 bytes per character), N is the
   number of keys, and M is the map overhead (which can be guesstimated
   at around 32 bytes per key).
   So, if you have 10-character keys and 1GB of headroom in your heap
   space for compaction, you can expect to store about 17M keys
   before running into problems.
   See https://issues.apache.org/jira/browse/CASSANDRA-208

2. Because fixing #1 requires a data file format change, 0.4 will not
   be binary-compatible with 0.3 data files. A client-side upgrade
   can be done relatively easily with the following algorithm:
     for key in old_client.get_key_range(everything):
         columns = old_client.get_slice or get_slice_super(key, all columns)
         new_client.batch_insert or batch_insert_super(key, columns)
   The inner loop can be trivially parallelized for speed.
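
One way to parallelize the per-key work is a thread pool. The sketch below
is an illustration under stated assumptions, not a migration tool: the
InMemoryClient class is a hypothetical stand-in that only borrows the
method names from the algorithm above, with simplified signatures, and it
ignores super columns. With real clients each worker would probably need
its own connection; that detail is omitted here.

```python
from concurrent.futures import ThreadPoolExecutor


class InMemoryClient:
    """Hypothetical stand-in for a client; keeps rows in a dict."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})

    def get_key_range(self):
        """Return every row key (simplified: no range arguments)."""
        return sorted(self.rows)

    def get_slice(self, key):
        """Return all columns of one row as a dict."""
        return dict(self.rows[key])

    def batch_insert(self, key, columns):
        """Store all columns of one row."""
        self.rows[key] = dict(columns)


def copy_row(old_client, new_client, key):
    """Copy every column of one row from the old store to the new one."""
    new_client.batch_insert(key, old_client.get_slice(key))
    return key


def migrate(old_client, new_client, workers=4):
    """Copy all rows, running the per-key work on a thread pool."""
    keys = old_client.get_key_range()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        copied = list(pool.map(lambda k: copy_row(old_client, new_client, k), keys))
    return len(copied)


if __name__ == "__main__":
    old = InMemoryClient({"a": {"c1": "v1"}, "b": {"c2": "v2"}})
    new = InMemoryClient()
    print("rows copied:", migrate(old, new))
```

Each key is copied independently of the others, which is why the loop body
can be handed to separate workers without coordination between them.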

3. Commitlog does not fsync before reporting a write successful.
   Using blocking writes mitigates this to some degree, since all
   nodes that were part of the write quorum would have to fail
   before sync for data to be lost.
   See https://issues.apache.org/jira/browse/CASSANDRA-182

Additionally, row size (that is, all the data associated with a single
key in a given ColumnFamily) is limited by available memory, because
compaction deserializes each row before merging.

See https://issues.apache.org/jira/browse/CASSANDRA-16