I'd like to share something very brief and very obvious: compression works better with large amounts of data. That is, if you have to compress 100 sentences, you'd better compress them in bulk rather than one sentence at a time. Let me illustrate that:

    public static void main(String[] args) throws Exception {
        // Build 100 sentences, each made of 100 random 10-letter words
        List<String> sentences = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            StringBuilder sentence = new StringBuilder();
            for (int j = 0; j < 100; j++) {
                sentence.append(RandomStringUtils.randomAlphabetic(10)).append(" ");
            }
            sentences.add(sentence.toString());
        }
        // Size when all sentences are joined and compressed together
        byte[] compressed = compress(StringUtils.join(sentences, ". "));
        System.out.println(compressed.length);
        // Total size when each sentence is compressed on its own
        System.out.println(sentences.stream().collect(Collectors.summingInt(sentence -> compress(sentence).length)));
    }

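
If you want to try the same comparison without any third-party libraries, here is a minimal sketch that swaps in the JDK's `java.util.zip.Deflater` for commons-compress and a seeded `java.util.Random` for `RandomStringUtils`. The class name `BulkCompressionDemo` and its helper methods are made up for this illustration, and the exact byte counts will differ from what the other algorithms produce.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.zip.Deflater;

// Hypothetical stand-alone demo: bulk vs. per-sentence compression with the JDK's Deflater.
class BulkCompressionDemo {

    private static final String LETTERS =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Returns the number of bytes produced by deflating the given text (zlib format, default level).
    static int deflatedSize(String text) {
        byte[] input = text.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] buffer = new byte[8192];
            int total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(buffer);
            }
            return total;
        } finally {
            deflater.end(); // release native resources
        }
    }

    // Builds `count` sentences, each consisting of `words` random words of `wordLength` letters.
    static List<String> randomSentences(Random random, int count, int words, int wordLength) {
        List<String> sentences = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            StringBuilder sentence = new StringBuilder();
            for (int j = 0; j < words; j++) {
                for (int k = 0; k < wordLength; k++) {
                    sentence.append(LETTERS.charAt(random.nextInt(LETTERS.length())));
                }
                sentence.append(' ');
            }
            sentences.add(sentence.toString());
        }
        return sentences;
    }

    // Size when all sentences are joined and compressed in one go.
    static int bulkSize(List<String> sentences) {
        return deflatedSize(String.join(". ", sentences));
    }

    // Sum of the sizes when every sentence is compressed separately.
    static int individualSize(List<String> sentences) {
        return sentences.stream().mapToInt(BulkCompressionDemo::deflatedSize).sum();
    }

    public static void main(String[] args) {
        List<String> sentences = randomSentences(new Random(42), 100, 100, 10);
        System.out.println("bulk:       " + bulkSize(sentences));
        System.out.println("individual: " + individualSize(sentences));
    }
}
```

Each separate call pays for its own stream header, checksum, and Huffman tables, so the per-sentence total should come out larger than the bulk figure.
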
The compress method uses commons-compress to easily generate results for multiple compression algorithms: