0 Introduction to Stream
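- Home address: `java.util.stream.Stream<T>`
- Date of birth: came into the world together with Java 8
- Main skills: enough to brag about for three days and nights...
- Main traits
  - Does not modify the input source
  - Intermediate operations are lazy (lazy evaluation, deferred execution)
  - A stream only does any work once it starts being consumed
  - Implicit iteration
  - ...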
Overall, Stream feels like an evolved version of `Iterator`. The Java 8 source documents it like this:
A sequence of elements supporting sequential and parallel aggregate operations
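Stream makes it convenient to perform complex operations on collections, such as traversal, filtering, mapping, aggregation, and slicing. Each intermediate operation yields a new Stream and leaves the original data untouched. These operations are also lazy: nothing is executed until the terminal (aggregating) operation runs, at which point the intermediate operations are applied together, in a single pass where possible.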
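To make the laziness described above visible, here is a minimal sketch that logs what each stage sees. The class name `LazyStreamDemo` and the method `trace` are made up for illustration; only `filter`, `map`, and `forEach` come from the Stream API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class LazyStreamDemo {

    // Builds a pipeline and records, in order, every element each stage sees.
    public static List<String> trace() {
        List<String> log = new ArrayList<>();
        List<Integer> source = Arrays.asList(1, 2, 3, 4);

        Stream<Integer> pipeline = source.stream()
                .filter(i -> { log.add("filter " + i); return i % 2 == 0; })
                .map(i -> { log.add("map " + i); return i * 10; });

        // Nothing has run yet: intermediate operations only describe the work.
        log.add("before terminal op");

        // The terminal operation pulls the elements through the whole pipeline.
        pipeline.forEach(i -> log.add("consume " + i));
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

The first log entry is "before terminal op", and after it each element travels through `filter` and `map` before the next element is touched, so the pipeline runs in one pass rather than stage by stage.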
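Compared with traditional manipulation of objects and data, Stream focuses on computation over a flow of elements, which is somewhat similar to functional programming.
Just how far it has evolved is best experienced first-hand.
Given a set of input data: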
```java
List<Integer> list = Arrays.asList(1, null, 3, 1, null, 4, 5, null, 2, 0);
```
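Compute the sum of the non-null odd numbers in the input, counting equal odd numbers only once.
- Back when lambdas were still in the womb, you might have done it like this:
```java
int s = 0;
// A Set removes the duplicates
Set<Integer> set = new HashSet<>(list);
for (Integer i : set) {
    // Keep non-null odd numbers only
    if (i != null && (i & 1) == 1) {
        s += i;
    }
}
System.out.println(s);
```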
- With Stream and lambdas, a single pipeline is enough:
```java
int sum = list.stream()
        .filter(e -> e != null && (e & 1) == 1)
        .distinct()
        .mapToInt(i -> i)
        .sum();
```
1 Obtaining a Stream
Starting with Java 8, interfaces can also contain methods marked `default`.
`java.util.Collection<E>` declares the following:
```java
public interface Collection<E> extends Iterable<E> {
    // ... other methods omitted

    default Stream<E> stream() {
        return StreamSupport.stream(spliterator(), false);
    }

    default Stream<E> parallelStream() {
        return StreamSupport.stream(spliterator(), true);
    }
}
```
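The only difference between the two default methods above is the boolean passed to `StreamSupport.stream`. A small sketch of how that shows up from the caller's side follows; the class `StreamKindsDemo` and its `sum` helper are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.List;

public class StreamKindsDemo {

    // Sums the list with either a sequential or a parallel stream.
    public static int sum(List<Integer> numbers, boolean parallel) {
        return (parallel ? numbers.parallelStream() : numbers.stream())
                .mapToInt(Integer::intValue)
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(numbers.stream().isParallel());         // false
        System.out.println(numbers.parallelStream().isParallel()); // true
        System.out.println(sum(numbers, false));                   // 15
        System.out.println(sum(numbers, true));                    // 15
    }
}
```

Both variants give the same sum because addition is associative; for operations where that does not hold, the parallel result may differ.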
`java.util.Arrays` declares the following:
```java
public static <T> Stream<T> stream(T[] array) {
    return stream(array, 0, array.length);
}

public static IntStream stream(int[] array) {
    return stream(array, 0, array.length);
}
```
Example
```java
List<String> strs = Arrays.asList("apache", "spark");
Stream<String> stringStream = strs.stream();

IntStream intStream = Arrays.stream(new int[] { 1, 25, 4, 2 });
```
```java
// From explicit values or from an array
Stream<String> stream = Stream.of("hello", "world");
Stream<String> stream2 = Stream.of("haha");
Stream<HouseInfo> stream3 = Stream.of(new HouseInfo[] { new HouseInfo(), new HouseInfo() });

// Infinite streams: iterate() keeps applying the function, generate() keeps calling the supplier
Stream<Integer> stream4 = Stream.iterate(1, i -> 2 * i + 1);
Stream<Double> stream5 = Stream.generate(() -> Math.random());
```
Note: `Stream.iterate()` and `Stream.generate()` produce infinite streams, so you usually have to call `limit()` yourself.
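As a rough illustration of that note, the sketch below bounds both kinds of infinite stream with `limit()` before collecting. `InfiniteStreamDemo`, `firstTerms`, and `randoms` are names invented for this example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class InfiniteStreamDemo {

    // First n terms of 1, 3, 7, 15, ... (each term is 2 * previous + 1).
    public static List<Integer> firstTerms(int n) {
        return Stream.iterate(1, i -> 2 * i + 1)
                .limit(n) // without limit(), collect() would never terminate
                .collect(Collectors.toList());
    }

    // n values from generate(); the supplier is invoked once per element.
    public static List<Double> randoms(int n) {
        return Stream.generate(Math::random)
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstTerms(5)); // [1, 3, 7, 15, 31]
        System.out.println(randoms(3).size()); // 3
    }
}
```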
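2 Transforming a Stream
Filtering and slicing
This part is fairly simple and clear; one example is enough.
```java
Stream<String> stream = Stream.of(
        null, "apache", null, "apache", "apache",
        "github", "docker", "java",
        "hadoop", "linux", "spark", "alifafa");

stream
        .filter(e -> e != null && e.contains("a")) // keep non-null strings containing "a"
        .distinct()                                 // drop duplicates
        .limit(3)                                   // keep the first three
        .forEach(System.out::println);              // prints apache, java, hadoop
```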
map/flatMap
Stream's `map` is defined as follows:
```java
<R> Stream<R> map(Function<? super T, ? extends R> mapper);
```
In other words, it takes an input (T: the element currently being iterated) and produces an output of another type (R).
```java
Stream.of(null, "apache", null, "apache", "apache",
        "hadoop", "linux", "spark", "alifafa")
        .filter(e -> e != null && e.length() > 0)
        .map(str -> str.charAt(0)) // String -> Character
        .forEach(System.out::println);
```
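The heading also mentions `flatMap`, which the example above does not cover. Where `map` turns each element into exactly one result, `flatMap` turns each element into a stream and then flattens those streams into one. A minimal sketch, with `FlatMapDemo` and `words` as made-up names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FlatMapDemo {

    // Splits each line into words and flattens the per-line streams into a single stream.
    public static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("apache spark", "apache hadoop", "linux");
        System.out.println(words(lines)); // [apache, spark, apache, hadoop, linux]
    }
}
```

With `map` instead of `flatMap`, the same lambda would produce a `Stream<Stream<String>>`, which is rarely what you want.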
sorted
Sorting is also fairly intuitive. There are two forms:
```java
// Natural ordering; the elements must be Comparable
Stream<T> sorted();

// Ordering defined by the given Comparator
Stream<T> sorted(Comparator<? super T> comparator);
```
Example:
```java
// Lists is Guava's com.google.common.collect.Lists.
// The constructor arguments are presumably id, houseName, browseCount, distance.
List<HouseInfo> houseInfos = Lists.newArrayList(
        new HouseInfo(1, "恒大星级公寓", 100, 1),
        new HouseInfo(2, "汇智湖畔", 999, 2),
        new HouseInfo(3, "张江汤臣豪园", 100, 1),
        new HouseInfo(4, "保利星苑", 23, 10),
        new HouseInfo(5, "北顾小区", 66, 23),
        new HouseInfo(6, "北杰公寓", null, 55),
        new HouseInfo(7, "保利星苑", 77, 66),
        new HouseInfo(8, "保利星苑", 111, 12)
);

// Order by distance, then by browse count; null values sort last.
Comparator<Integer> nullsLast = Comparator.nullsLast(Comparator.<Integer>naturalOrder());
houseInfos.stream()
        .sorted(Comparator.comparing(HouseInfo::getDistance, nullsLast)
                .thenComparing(HouseInfo::getBrowseCount, nullsLast))
        // sorted() is lazy, so a terminal operation is needed to see the result
        .forEach(System.out::println);
```
3 Terminating/Consuming a Stream
Predicate tests and basic statistics
```java
List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);

// Do all / any / none of the elements match?
System.out.println(list.stream().allMatch(e -> e > 0));         // true
System.out.println(list.stream().anyMatch(e -> (e & 1) == 0));  // true
System.out.println(list.stream().noneMatch(e -> e < 0));        // true

// First element that matches
Optional<Integer> optional = list.stream().filter(e -> e >= 4).findFirst();
optional.ifPresent(System.out::println);                        // 4

// Count, min, max
System.out.println(list.stream().filter(e -> e >= 4).count());  // 2
System.out.println(list.stream().min(Integer::compareTo));      // Optional[1]
System.out.println(list.stream().max(Integer::compareTo));      // Optional[5]
System.out.println(list.stream().mapToInt(i -> i).max());       // OptionalInt[5]
```
reduce
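There is no obvious translation for this word; some render it as "reduction", others as "aggregation".
Either way, it gathers the elements left in the stream after a series of transformations into a final result, repeatedly applying a reduce function along the way.
The `reduce()` method has the following overloads:
```java
T reduce(T identity, BinaryOperator<T> accumulator);

Optional<T> reduce(BinaryOperator<T> accumulator);

<U> U reduce(U identity,
             BiFunction<U, ? super T, U> accumulator,
             BinaryOperator<U> combiner);
```
Example:
```java
// With an identity value: the result is always present (55 here)
Integer reduce = Stream.iterate(1, i -> i + 1)
        .limit(10)
        .reduce(0, (i, j) -> i + j);

// Without an identity value: the result is an Optional, empty for an empty stream
Optional<Integer> reduce2 = Stream.iterate(1, i -> i + 1)
        .limit(10)
        .reduce((i, j) -> i + j);
```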
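The examples above do not exercise the overload that takes a combiner. It is useful when the result type differs from the element type, and the combiner is what merges partial results when the stream runs in parallel. A small sketch, with `ReduceCombinerDemo` and `totalLength` as hypothetical names:

```java
import java.util.Arrays;
import java.util.List;

public class ReduceCombinerDemo {

    // Total number of characters: the result type (Integer) differs from the element type (String).
    public static int totalLength(List<String> words, boolean parallel) {
        return (parallel ? words.parallelStream() : words.stream())
                .reduce(0,
                        (sum, word) -> sum + word.length(), // accumulator: fold one element into a partial result
                        Integer::sum);                      // combiner: merge two partial results
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apache", "spark", "hadoop");
        System.out.println(totalLength(words, false)); // 17
        System.out.println(totalLength(words, true));  // 17
    }
}
```

In practice `words.stream().mapToInt(String::length).sum()` is the simpler way to get this number; the three-argument form is shown only to illustrate the roles of the accumulator and the combiner.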
collect
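This operation is easy to understand: as the name suggests, it collects the elements of the Stream into one place.
```java
<R> R collect(Supplier<R> supplier,
              BiConsumer<R, ? super T> accumulator,
              BiConsumer<R, R> combiner);
```
```java
<R, A> R collect(Collector<? super T, A, R> collector);
```
The key parts of the `Collector` interface (it is not a functional interface, so a lambda cannot stand in for it) are as follows:
```java
public interface Collector<T, A, R> {
    // Creates a new mutable result container
    Supplier<A> supplier();

    // Folds one element into the container
    BiConsumer<A, T> accumulator();

    // Merges two partial containers (used by parallel streams)
    BinaryOperator<A> combiner();

    // Converts the container into the final result
    Function<A, R> finisher();

    // Hints such as CONCURRENT, UNORDERED, IDENTITY_FINISH
    Set<Characteristics> characteristics();
}
```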
Let's start with an example of the three-argument `collect()` method. Barring special cases, I guarantee that once you have seen it you will never want to use it again...
```java
List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);

ArrayList<Integer> ret1 = numbers.stream()
        .map(i -> i * 2)
        .collect(
                () -> new ArrayList<Integer>(),          // supplier: create the container
                (list, e) -> list.add(e),                // accumulator: add one element
                (list1, list2) -> list1.addAll(list2)    // combiner: merge two containers
        );

ret1.forEach(System.out::println);
```
Without lambdas, the equivalent code would look like this...
```java
List<Integer> ret3 = numbers.stream()
        .map(i -> i * 2)
        .collect(new Supplier<List<Integer>>() {
            @Override
            public List<Integer> get() {
                return new ArrayList<>();
            }
        }, new BiConsumer<List<Integer>, Integer>() {
            @Override
            public void accept(List<Integer> list, Integer e) {
                list.add(e);
            }
        }, new BiConsumer<List<Integer>, List<Integer>>() {
            @Override
            public void accept(List<Integer> list1, List<Integer> list2) {
                list1.addAll(list2);
            }
        });

ret3.forEach(System.out::println);
```
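Feeling a bit queasy yet?
Likewise, when you call Spark's API from Java without lambdas, the code gets even uglier than the above...
And a free plug while I'm at it: my post implementing Spark's HelloWorld in several different versions, http://blog.csdn.net/hylexus/article/details/52606540, shows just how happy a world with lambdas is...
That said, once you understand the three-argument `collect` method, you can use constructor references and method references to make the code more concise:
```java
ArrayList<Integer> ret2 = numbers.stream()
        .map(i -> i * 2)
        .collect(
                ArrayList::new,
                List::add,
                List::addAll
        );

ret2.forEach(System.out::println);
```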
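Using the Collectors utility (advanced statistics)
Both the three-argument and the one-argument `collect()` methods above are rather complex. The one-argument version is the one used most often, but implementing the `Collector` yourself is still painful.
Fortunately, the Collectors for the common collect operations are all provided in `java.util.stream.Collectors`. A very powerful tool...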
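For a sense of what writing a Collector by hand involves, here is a minimal sketch built with the `Collector.of` factory method, which takes the four functions of the interface as arguments. It behaves roughly like `Collectors.joining(", ", "[", "]")`. The names `CustomCollectorDemo`, `bracketed`, and `join` are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;
import java.util.stream.Collector;

public class CustomCollectorDemo {

    // A hand-built collector that joins strings as "[a, b, c]".
    public static Collector<String, StringJoiner, String> bracketed() {
        return Collector.of(
                () -> new StringJoiner(", ", "[", "]"), // supplier: new mutable container
                StringJoiner::add,                      // accumulator: add one element
                StringJoiner::merge,                    // combiner: merge partial containers
                StringJoiner::toString);                // finisher: container -> final result
    }

    public static String join(List<String> items) {
        return items.stream().collect(bracketed());
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("a", "b", "c"))); // [a, b, c]
    }
}
```

Even this small case needs four cooperating functions, which is why reaching for the ready-made factories in `Collectors` is usually the better choice.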
The following examples all operate on this list:
```java
List<HouseInfo> houseInfos = Lists.newArrayList(
        new HouseInfo(1, "恒大星级公寓", 100, 1),
        new HouseInfo(2, "汇智湖畔", 999, 2),
        new HouseInfo(3, "张江汤臣豪园", 100, 1),
        new HouseInfo(4, "保利星苑", 111, 10),
        new HouseInfo(5, "北顾小区", 66, 23),
        new HouseInfo(6, "北杰公寓", 77, 55),
        new HouseInfo(7, "保利星苑", 77, 66),
        new HouseInfo(8, "保利星苑", 111, 12)
);
```
OK, let the showing off begin ^_^ ...
```java
// Collect the house names into a List
List<String> ret1 = houseInfos.stream()
        .map(HouseInfo::getHouseName).collect(Collectors.toList());
ret1.forEach(System.out::println);

// Collect into a Set (duplicate names are dropped)
Set<String> ret2 = houseInfos.stream()
        .map(HouseInfo::getHouseName).collect(Collectors.toSet());
ret2.forEach(System.out::println);

// Join the names with a separator
String names = houseInfos.stream()
        .map(HouseInfo::getHouseName).collect(Collectors.joining("_^_"));
System.out.println(names);

// Collect into a specific collection type
ArrayList<String> collect = houseInfos.stream()
        .map(HouseInfo::getHouseName)
        .collect(Collectors.toCollection(ArrayList::new));
```
```java
// The house with the highest browse count
Optional<HouseInfo> ret3 = houseInfos.stream()
        .filter(h -> h.getBrowseCount() != null)
        .collect(Collectors.maxBy((h1, h2) ->
                Integer.compare(h1.getBrowseCount(), h2.getBrowseCount())));
System.out.println(ret3.get());

// The highest browse count itself
Optional<Integer> ret4 = houseInfos.stream()
        .filter(h -> h.getBrowseCount() != null)
        .map(HouseInfo::getBrowseCount)
        .collect(Collectors.maxBy(Integer::compare));
System.out.println(ret4.get());
```
```java
// Number of elements
Long total = houseInfos.stream().collect(Collectors.counting());
System.out.println(total);

// Sum of browse counts, three equivalent ways
Integer ret5 = houseInfos.stream()
        .filter(h -> h.getBrowseCount() != null)
        .collect(Collectors.summingInt(HouseInfo::getBrowseCount));
System.out.println(ret5);

Integer ret6 = houseInfos.stream()
        .filter(h -> h.getBrowseCount() != null)
        .map(HouseInfo::getBrowseCount).collect(Collectors.summingInt(i -> i));
System.out.println(ret6);

int ret7 = houseInfos.stream()
        .filter(h -> h.getBrowseCount() != null)
        .mapToInt(HouseInfo::getBrowseCount)
        .sum();
System.out.println(ret7);
```
```java
// Average browse count via a Collector
Double ret8 = houseInfos.stream()
        .filter(h -> h.getBrowseCount() != null)
        .collect(Collectors.averagingDouble(HouseInfo::getBrowseCount));
System.out.println(ret8);

// The same average via a DoubleStream
OptionalDouble ret9 = houseInfos.stream()
        .filter(h -> h.getBrowseCount() != null)
        .mapToDouble(HouseInfo::getBrowseCount)
        .average();
System.out.println(ret9.getAsDouble());
```
```java
// Count, sum, min, max, and average in a single pass
DoubleSummaryStatistics statistics = houseInfos.stream()
        .filter(h -> h.getBrowseCount() != null)
        .collect(Collectors.summarizingDouble(HouseInfo::getBrowseCount));
System.out.println("avg:" + statistics.getAverage());
System.out.println("max:" + statistics.getMax());
System.out.println("sum:" + statistics.getSum());
```
```java
// Group by browse count
Map<Integer, List<HouseInfo>> ret10 = houseInfos.stream()
        .filter(h -> h.getBrowseCount() != null)
        .collect(Collectors.groupingBy(HouseInfo::getBrowseCount));

ret10.forEach((count, house) -> {
    System.out.println("BrowseCount:" + count + " " + house);
});

// Two-level grouping: by browse count, then by a distance label
Map<Integer, Map<String, List<HouseInfo>>> ret11 = houseInfos.stream()
        .filter(h -> h.getBrowseCount() != null && h.getDistance() != null)
        .collect(Collectors.groupingBy(
                HouseInfo::getBrowseCount,
                Collectors.groupingBy((HouseInfo h) -> {
                    if (h.getDistance() <= 10)
                        return "fairly near";
                    else if (h.getDistance() <= 20)
                        return "near";
                    return "fairly far";
                })));

ret11.forEach((count, v) -> {
    System.out.println("Browse count: " + count);
    v.forEach((desc, houses) -> {
        System.out.println("\t" + desc);
        houses.forEach(h -> System.out.println("\t\t" + h));
    });
});
```
```java
// Partition into two groups by a predicate on distance
Map<Boolean, List<HouseInfo>> ret12 = houseInfos.stream()
        .filter(h -> h.getDistance() != null)
        .collect(Collectors.partitioningBy(h -> h.getDistance() <= 20));

ret12.forEach((t, houses) -> {
    System.out.println(t ? "fairly near" : "fairly far");
    houses.forEach(h -> System.out.println("\t\t" + h));
});

// Two-level partitioning: by distance, then by browse count
Map<Boolean, Map<Boolean, List<HouseInfo>>> ret13 = houseInfos.stream()
        .filter(h -> h.getDistance() != null && h.getBrowseCount() != null)
        .collect(
                Collectors.partitioningBy(h -> h.getDistance() <= 20,
                        Collectors.partitioningBy(h -> h.getBrowseCount() >= 70))
        );

ret13.forEach((less, value) -> {
    System.out.println(less ? "fairly near" : "fairly far");
    value.forEach((moreCount, houses) -> {
        System.out.println(moreCount ? "\tmore views" : "\tfewer views");
        houses.forEach(h -> System.out.println("\t\t" + h));
    });
});
```