Scala Data Structures
Arrays (Array)
final class Array[T](_length: Int) extends java.io.Serializable with java.lang.Cloneable
Arrays are mutable indexed collections; their contents can be modified.
Construction
An empty array of a given length
scala> val arr = new Array[String](4)
arr: Array[String] = Array(null, null, null, null)
Explicitly specifying the element type
scala> val nums = Array[Int](1,2,3)
nums: Array[Int] = Array(1, 2, 3)
With the element type inferred
scala> val subjects = Array("Hadoop","HBase","Spark")
subjects: Array[String] = Array(Hadoop, HBase, Spark)
Multidimensional arrays
scala> var twoDimArray = Array.ofDim[Int](2,3)
twoDimArray: Array[Array[Int]] = Array(Array(0, 0, 0), Array(0, 0, 0))
Element access and modification
scala> nums(1)
res4: Int = 2

scala> subjects(2) = "PySpark"

scala> subjects(2)
res2: String = PySpark

scala> twoDimArray(1)(1) = 1

scala> twoDimArray
res5: Array[Array[Int]] = Array(Array(0, 0, 0), Array(0, 1, 0))
Tuples (TupleN)
A tuple wraps several objects of possibly different types; Scala provides the TupleN classes (1 <= N <= 22). Tuple contents are immutable.
Construction
scala> val scores = Tuple3("quqingyuan", 'm', 98)
scores: (String, Char, Int) = (quqingyuan,m,98)

scala> ("quqingyuan", 'm', 98)
res6: (String, Char, Int) = (quqingyuan,m,98)
Element access
scala> scores._1
res8: String = quqingyuan
Note: tuple indices start at 1, and tuple elements cannot be modified.
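As a small sketch beyond the notes above, a tuple can also be destructured by pattern matching instead of the _N accessors (the value names here are illustrative):

```scala
// Bind each tuple field to a name in one step.
val scores = ("quqingyuan", 'm', 98)
val (name, gender, score) = scores
println(name)  // quqingyuan
println(score) // 98
```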
Note
An array can also hold objects of different types; Scala picks the nearest common type of all the initial values as the element type.
scala> val subjects = Array(1, 2.3f)
subjects: Array[Float] = Array(1.0, 2.3)
Related traits
Seq/Set/Map
Seq (sequence): ordered, indexed by integers starting from 0;
Set: unordered, cannot be indexed;
Map: indexed by key.
List, Set, and Map are all defined in the scala.collection.immutable package; their values are immutable.
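A minimal side-by-side sketch of the three kinds of collections (the values are illustrative):

```scala
val seq = Seq("Hadoop", "HBase", "Spark")     // ordered, indexed from 0
val set = Set("Hadoop", "HBase", "Spark")     // unordered, no positional index
val map = Map("Hadoop" -> 80, "Spark" -> 90)  // indexed by key

println(seq(0))                // Hadoop
println(set.contains("Spark")) // true
println(map("Spark"))          // 90
```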
Sequences (Seq)
Lists (List)
Construction
scala> val subjects = List[String]("Hadoop","HBase","Spark")
subjects: List[String] = List(Hadoop, HBase, Spark)
The empty list
scala> Nil
res0: scala.collection.immutable.Nil.type = List()
Access
By index
scala> subjects(2)
res5: String = Spark
Head
scala> subjects.head
res13: String = Hadoop
Tail
tail returns the list of all elements except the first.
scala> subjects.tail
res15: List[String] = List(HBase, Spark)
Note: List is a linked list. head and tail are O(1), but any other indexed access must traverse from the head, which is O(n). For constant-time indexed access, use Vector.
Concatenation
Prepending, form 1
scala> val subjectsList = "Linux"::subjects
subjectsList: List[String] = List(Linux, Hadoop, HBase, Spark)
The :: operator is actually the following method:
scala> val subjectList2 = subjectsList.::("Java")
subjectList2: List[String] = List(Java, Linux, Hadoop, HBase, Spark)
Prepending, form 2
scala> val subjectsList3 = "Java"::"Hadoop"::"HBase"::Nil
subjectsList3: List[String] = List(Java, Hadoop, HBase)
Note that the trailing empty list Nil cannot be omitted.
Prepending, form 3
scala> "Java"+:subjects
res8: List[String] = List(Java, Hadoop, HBase, Spark)
The +: operator is actually the following method:
scala> subjects.+:("Java")
res11: List[String] = List(Java, Hadoop, HBase, Spark)
Appending
scala> subjects:+"Flink"
res9: List[String] = List(Hadoop, HBase, Spark, Flink)
Note: the ::, +:, and :+ operators all return a new object.
Concatenating two lists
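As a quick sketch of this point: prepending builds a new list and leaves the original untouched.

```scala
val subjects = List("Hadoop", "HBase", "Spark")
val extended = "Java" +: subjects
println(extended) // List(Java, Hadoop, HBase, Spark)
println(subjects) // List(Hadoop, HBase, Spark) -- the original is unchanged
```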
scala> val subjects1 = List[String]("Hadoop","HBase","Spark")
subjects1: List[String] = List(Hadoop, HBase, Spark)

scala> val subjects2 = List("Storm","Flink")
subjects2: List[String] = List(Storm, Flink)

scala> subjects1++subjects2
res31: List[String] = List(Hadoop, HBase, Spark, Storm, Flink)
Vectors (Vector)
scala> val subjects = Vector("quqingyuan",'m',90)
subjects: scala.collection.immutable.Vector[Any] = Vector(quqingyuan, m, 90)
Concatenation
scala> val subject1 = 0+:subjects
subject1: scala.collection.immutable.Vector[Any] = Vector(0, quqingyuan, m, 90)

scala> val subject1 = subjects:+"shandong"
subject1: scala.collection.immutable.Vector[Any] = Vector(quqingyuan, m, 90, shandong)
Note that Vector does not support the :: operator; :: is a member method of List.
List and Vector are both immutable: their elements cannot be added, removed, or modified. Their mutable counterparts are ListBuffer and ArrayBuffer, both defined in the scala.collection.mutable package, where values are mutable. Import them before use:
scala> import scala.collection.mutable.ListBuffer
import scala.collection.mutable.ListBuffer
ListBuffer
Construction
scala> val subjects = ListBuffer("Hadoop","HBase","Spark")
subjects: scala.collection.mutable.ListBuffer[String] = ListBuffer(Hadoop, HBase, Spark)
Modification
Appending
scala> "Java"+:subjects
res19: scala.collection.mutable.ListBuffer[String] = ListBuffer(Java, Hadoop, HBase, Spark)

scala> subjects:+"Storm"
res20: scala.collection.mutable.ListBuffer[String] = ListBuffer(Hadoop, HBase, Spark, Storm)

scala> subjects+="Flink"
res21: subjects.type = ListBuffer(Hadoop, HBase, Spark, Flink)
Removal
scala> subjects-="HBase"
res22: subjects.type = ListBuffer(Hadoop, Spark, Flink)
Observations:
As with List, the +: and :+ operators do not modify the original object; they return a new one.
To modify the original object in place, use the += or -= operators.
As with List, ++ concatenates two ListBuffers without modifying either; it returns the concatenated result.
To append another ListBuffer onto the original in place, use the ++= operator.
scala> val subjects1 = ListBuffer("Hadoop","HBase","Spark")
subjects1: scala.collection.mutable.ListBuffer[String] = ListBuffer(Hadoop, HBase, Spark)

scala> val subjects2 = ListBuffer("Storm","Flink")
subjects2: scala.collection.mutable.ListBuffer[String] = ListBuffer(Storm, Flink)

scala> subjects1++subjects2
res34: scala.collection.mutable.ListBuffer[String] = ListBuffer(Hadoop, HBase, Spark, Storm, Flink)

scala> subjects1++=subjects2
res35: subjects1.type = ListBuffer(Hadoop, HBase, Spark, Storm, Flink)
Inserting at an index
scala> subjects.insert(1,"HBase","Hive")

scala> subjects
res24: scala.collection.mutable.ListBuffer[String] = ListBuffer(Hadoop, HBase, Hive, Spark, Flink)
Removing at an index
remove returns the removed element.
scala> subjects.remove(2)
res26: String = Hive
ArrayBuffer
Omitted here.
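The notes skip ArrayBuffer, but as a rough sketch it mirrors the ListBuffer operations shown above while being backed by an array, so indexed access is constant time:

```scala
import scala.collection.mutable.ArrayBuffer

val subjects = ArrayBuffer("Hadoop", "HBase", "Spark")
subjects += "Flink"               // append in place
subjects.insert(1, "Hive")        // insert at index 1
val removed = subjects.remove(0)  // returns the removed element
println(removed)  // Hadoop
println(subjects) // ArrayBuffer(Hive, HBase, Spark, Flink)
```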
Arithmetic ranges
until
scala> val nums = 1 until 7 by 2
nums: scala.collection.immutable.Range = Range 1 until 7 by 2

scala> nums(2)
res45: Int = 5

scala> nums.length
res46: Int = 3
Range
scala> Range(1,7,2)
res47: scala.collection.immutable.Range = Range 1 until 7 by 2
to
scala> val nums = 1 to 7 by 2
nums: scala.collection.immutable.Range = Range 1 to 7 by 2

scala> nums.length
res49: Int = 4
Note: until and to can also be used with floating-point and character types; when Range is applied to characters, they are converted to the corresponding numbers via their ASCII codes.
scala> val chars = 'a' to 'z'
chars: scala.collection.immutable.NumericRange.Inclusive[Char] = NumericRange a to z

scala> chars.length
res52: Int = 26

scala> chars(2)
res55: Char = c

scala> Range('a','z')
res56: scala.collection.immutable.Range = Range 97 until 122
Sets (Set)
By default, the Set you create is immutable.
Immutable
scala> var subjects = Set("Hadoop","HBase","Spark")
subjects: scala.collection.immutable.Set[String] = Set(Hadoop, HBase, Spark)

scala> subjects+="Flink"

scala> subjects
res28: scala.collection.immutable.Set[String] = Set(Hadoop, HBase, Spark, Flink)
Mutable
scala> import scala.collection.mutable.Set
import scala.collection.mutable.Set

scala> val subjects = Set("Hadoop","HBase","Spark")
subjects: scala.collection.mutable.Set[String] = Set(Hadoop, HBase, Spark)

scala> subjects+="Flink"
res29: subjects.type = Set(Flink, Hadoop, HBase, Spark)

scala> subjects
res30: scala.collection.mutable.Set[String] = Set(Flink, Hadoop, HBase, Spark)
Maps (Map)
As with List and Set, Scala uses the immutable Map by default.
immutable
var hadoopScore = Map("zhangsan"->80, "lisi"->90, "quqingyuan"->100)

val score = if (hadoopScore.contains("quqingyuan")) {
  hadoopScore("quqingyuan")
} else {
  80
}
println(score)

hadoopScore += ("wangwu"->75)  // reassigns the var to a new immutable Map
println(hadoopScore)

hadoopScore("quqingyuan") = 100  // error: an immutable Map has no update method
mutable
import scala.collection.mutable.Map

val hadoopScore = Map("zhangsan" -> 80, "lisi" -> 90, "quqingyuan" -> 100)

val score = if (hadoopScore.contains("quqingyuan")) {
  hadoopScore("quqingyuan")
} else {
  80
}
println(score)

hadoopScore += ("wangwu" -> 75)
println(hadoopScore)

hadoopScore("quqingyuan") = 99  // in-place update works on a mutable Map
Iterators (Iterator)
val hadoopScore = Map("zhangsan" -> 80, "lisi" -> 90, "quqingyuan" -> 100)

// Iterator(hadoopScore.keys) would wrap the whole key set as a single element;
// use the key set's own iterator instead.
val iter = hadoopScore.keys.iterator
while (iter.hasNext) {
  println(iter.next())
}
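Alternatively (a sketch, assuming the same hadoopScore map), a for comprehension iterates over key-value pairs directly, with no explicit iterator:

```scala
val hadoopScore = Map("zhangsan" -> 80, "lisi" -> 90, "quqingyuan" -> 100)

// Each entry is destructured into (name, score).
for ((name, score) <- hadoopScore) {
  println(s"$name: $score")
}
```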