15 Linux Bash History Expansion Examples You Should Know



Bash history is very powerful. Understanding how to effectively use the bash history expansions will make you extremely productive on the Linux command line.
This article explains 15 examples that use the following bash history expansion features:
  • Event designators – Refers to a particular command in the history. It starts with a !
  • Word designators – Refers to a particular word of a history entry. Typically this gets combined with an event designator. Event designators and word designators are separated by a colon
  • Modifiers – Modifies the result of the substitution done by the event or word designators
This article is part of our on-going Bash Tutorial Series.
As you already know, to view all the history entries, use the history command. This will display all the commands that were executed earlier along with the number for that command in the history table.
$ history
1 tar cvf etc.tar /etc/
2 cp /etc/passwd /backup
3 ps -ef | grep http
4 service sshd restart
5 /usr/local/apache2/bin/apachectl restart

Bash History Event Designators

1. Execute a specific command from the history using !n

If you’ve executed a command earlier, instead of re-typing it again, you can quickly execute it by using the corresponding number of the command in the history.
For example, to execute command #4, do the following. This will display command #4 from the history, and execute it immediately.
$ !4
service sshd restart
To execute a command that was typed 2 commands back, do the following.
$ !-2
To execute the previous command, do any one of the following:
$ !!

$ !-1
You can also press Ctrl-P (if you are in the default emacs mode) to get to the previous command.
If you’ve enabled vi style editing for the command line using ‘set -o vi’, press Esc and then k to get to the previous command.

2. Execute a command with keywords using !string and !?string

You can also use keywords to execute a command from the history.
The following example searches for the most recent command that STARTS with the keyword “ps” and executes it. In this example, it picks up the previous command “ps -ef | grep http” and executes it.
$ !ps
ps -ef | grep http
The following example searches for the most recent command that CONTAINS the keyword “apache” and executes it. In this example, it picks up the previous command “/usr/local/apache2/bin/apachectl restart” and executes it.
$ !?apache
/usr/local/apache2/bin/apachectl restart

3. Replace a string from the previous command using ^str1^str2^

In the following example, first we executed the ls command to verify a file. Later we realized that we want to view the content of the file. Instead of typing the whole file name again, we can just replace the “ls” in the previous command with “cat” as shown below.
$ ls /etc/cron.daily/logrotate

$ ^ls^cat^
cat /etc/cron.daily/logrotate
Note: For additional bash history tips, refer to 15 Examples To Master Linux Command Line History. This explains how to display timestamp in the history, and how to use various history related environment variables including HISTTIMEFORMAT, HISTSIZE, HISTFILE, HISTCONTROL, and HISTIGNORE

Bash History Word Designators

Word designators are very helpful when you want to type a new command but reuse an argument from a command that was executed earlier. Some examples are shown below.

4. Get the 1st argument of a command using :^

In the following example, “!cp:^” was given as an argument to “ls -l” command. “!cp:^” locates the previous command in the history that starts with “cp” and gets the 1st argument of that command.
$ cp /etc/passwd /backup

$ ls -l !cp:^
ls -l /etc/passwd
The following example gets the 1st argument from the previous command.
$ ls -l !!:^

5. Get the last argument of a command using :$

In the following example, “!cp:$” was given as an argument to “ls -l” command. “!cp:$” locates the previous command in the history that starts with “cp” and gets the last argument of that command.
$ cp /etc/passwd /backup

$ ls -l !cp:$
ls -l /backup
The following example gets the last argument from the previous command.
$ ls -l !!:$

6. Get the nth argument of a command using :n

In the following example, “!tar:2” was given as an argument to the “ls -l” command. “!tar:2” locates the previous command in the history that starts with “tar” and gets the 2nd argument of that command.
$ tar cvfz /backup/home-dir-backup.tar.gz /home

$ ls -l !tar:2
ls -l /backup/home-dir-backup.tar.gz

7. Get all the arguments from a command using :*

In the following example, “!cp:*” was given as an argument to the “ls -l” command. “!cp:*” locates the previous command in the history that starts with “cp” and gets all its arguments.
$ cp /etc/passwd /backup

$ ls -l !cp:*
ls -l /etc/passwd /backup

8. Refer to the recently searched word using !%

As we explained above, the “!?apache” will search for the previous history command that CONTAINS the keyword “apache” and execute it.
$ /usr/local/apache2/bin/apachectl restart

$ !?apache
/usr/local/apache2/bin/apachectl restart
!% will refer to the whole word that was matched by the previous “?” search.
For example, if you’ve previously searched for “?apache”, then “!%” will match the whole word “/usr/local/apache2/bin/apachectl”. Note that “/” is treated as part of the word in this context.
So, in this case, you can stop Apache by executing the following.
$ !% stop
/usr/local/apache2/bin/apachectl stop

9. Get range of arguments from a command using x-y

In the following example, “!tar:3-5” was given as an argument to the “ls -l” command. “!tar:3-5” locates the previous command in the history that starts with “tar” and gets arguments 3 through 5.
$ tar cvf home-dir.tar john jason ramesh rita

$ ls -l !tar:3-5
ls -l john jason ramesh
The following gets all the arguments from the 2nd one onward.
$ ls -l !tar:2-$
Please note the following:
  • !!:* Gets all the arguments from the previous command
  • !!:2* Gets all the arguments starting from 2nd argument.
  • !!:2-$ Same as above. Gets all the arguments starting from 2nd argument.
  • !!:2- Gets all the arguments starting from 2nd argument (except the last argument).

Bash History Modifiers

Modifiers are given after the word designators, as explained in the examples below.

10. Remove the trailing path name from a word using :h

In the following example, “!!:$:h” takes the last argument of the previous command and removes the trailing path name component. In this case, it removes the file name and leaves only the directory path.
$ ls -l /very/long/path/name/file-name.txt

$ ls -l !!:$:h
ls -l /very/long/path/name

11. Remove all leading path name from a word using :t

This is the exact opposite of the previous example.
In the following example, “!!:$:t” takes the last argument of the previous command, and removes all the leading path names. In this case, it gets only the file name.
$ ls -l /very/long/path/name/file-name.txt

$ ls -l !!:$:t
ls -l file-name.txt

12. Remove the file name extension from a word using :r

In the following example, “!!:$:r” takes the last argument of the previous command, and removes only the “.suffix” (which is file name extension here). In this case, it removed .txt
$ ls -l /very/long/path/name/file-name.txt

$ ls -l !!:$:r
ls -l /very/long/path/name/file-name

13. Sed like Substitution in bash history using :s/str1/str2/

Instead of using “^original^replacement^” as discussed earlier, you can also use a sed-like substitution in the bash history, as shown in the example below. This might be easier to remember: !! calls the previous command, and “:s/original-string/replacement-string/” is the sed-like syntax to replace a string.
$ !!:s/ls -l/cat/
You can also use the g flag (along with the s flag) to do a global substitution, as shown below. This is helpful when you’ve mistyped multiple words and would like to change all of them together and execute the command again.
In the following example, by mistake I’ve given “password” twice (instead of passwd).
$ cp /etc/password /backup/password.bak
To fix this, just do the following global history sed like substitution.
$ !!:gs/password/passwd/
cp /etc/passwd /backup/passwd.bak

14. Repeat the substitution quickly using :&

If you’ve already executed a bash history substitution successfully as shown above, you can quickly repeat the same substitution using :&.
By mistake, I’ve typed “password” again instead of “passwd” in another command.
$ tar cvf password.tar /etc/password
Now, instead of retyping the command or repeating “gs/password/passwd”, I can just use “:&”, which reuses the last substitution. Use “:g&” to reuse the last substitution globally.
$ !!:g&
tar cvf passwd.tar /etc/passwd

15. Print the command without executing it using :p

This is very helpful when you are doing complex history substitution, and you want to view the final command before executing it.
In the following example, “!tar:3-:p” doesn’t actually execute the command.
Since we’ve given “:p” here, it just does the substitution and displays the new command. Once you’ve verified the bash history expansion, and if you think this is the command you intended to run, remove the “:p” and execute it again.
$ tar cvf home-dir.tar john jason ramesh rita

$ tar cvfz new-file.tar !tar:3-:p
tar cvfz new-file.tar john jason ramesh

15 Examples To Master Linux Command Line History



When you are using Linux command line frequently, using the history effectively can be a major productivity boost. In fact, once you have mastered the 15 examples that I’ve provided here, you’ll find using command line more enjoyable and fun.

1. Display timestamp using HISTTIMEFORMAT

Typically, when you type history from the command line, it displays the command number and the command. For auditing purposes, it may be beneficial to display the timestamp along with the command, as shown below.
# export HISTTIMEFORMAT='%F %T '
# history | more
1  2008-08-05 19:02:39 service network restart
2  2008-08-05 19:02:39 exit
3  2008-08-05 19:02:39 id
4  2008-08-05 19:02:39 cat /etc/redhat-release

2. Search the history using Control+R

I strongly believe this may be your most frequently used history feature. When you’ve already executed a very long command, you can simply search the history using a keyword and re-execute the same command without having to type it fully. Press Control+R and type the keyword. In the following example, I searched for red, which displayed the previous command “cat /etc/redhat-release” in the history that contained the word red.
# [Press Ctrl+R from the command prompt,
which will display the reverse-i-search prompt]
(reverse-i-search)`red': cat /etc/redhat-release
[Note: Press enter when you see your command,
which will execute the command from the history]
# cat /etc/redhat-release
Fedora release 9 (Sulphur)
Sometimes you want to edit a command from the history before executing it. For example, you can search for httpd, which will display service httpd stop from the command history. Select this command, change the stop to start, and re-execute it as shown below.
# [Press Ctrl+R from the command prompt,
which will display the reverse-i-search prompt]
(reverse-i-search)`httpd': service httpd stop
[Note: Press either left arrow or right arrow key when you see your
command, which will display the command for you to edit, before executing it]
# service httpd start

3. Repeat previous command quickly using 4 different methods

Sometimes you may end up repeating previous commands for various reasons. Following are 4 different ways to repeat the last executed command.
  1. Use the up arrow to view the previous command and press enter to execute it.
  2. Type !! and press enter from the command line
  3. Type !-1 and press enter from the command line.
  4. Press Control+P to display the previous command, then press enter to execute it

4. Execute a specific command from history

In the following example, to repeat command #4, you can type !4 as shown below.
# history | more
1  service network restart
2  exit
3  id
4  cat /etc/redhat-release

# !4
cat /etc/redhat-release
Fedora release 9 (Sulphur)

5. Execute previous command that starts with a specific word

Type ! followed by the starting few letters of the command that you would like to re-execute. In the following example, typing !ps and enter, executed the previous command starting with ps, which is ‘ps aux | grep yp’.
# !ps
ps aux | grep yp
root     16947  0.0  0.1  36516  1264 ?        Sl   13:10   0:00 ypbind
root     17503  0.0  0.0   4124   740 pts/0    S+   19:19   0:00 grep yp

6. Control the total number of lines in the history using HISTSIZE

Append the following two lines to your .bash_profile and log in to the bash shell again to see the change. In this example, only 450 commands will be stored in the bash history.
# vi ~/.bash_profile
HISTSIZE=450
HISTFILESIZE=450

7. Change the history file name using HISTFILE

By default, history is stored in ~/.bash_history file. Add the following line to the .bash_profile and relogin to the bash shell, to store the history command in .commandline_warrior file instead of .bash_history file. I’m yet to figure out a practical use for this. I can see this getting used when you want to track commands executed from different terminals using different history file name.
# vi ~/.bash_profile
HISTFILE=/root/.commandline_warrior
If you have a good reason to change the name of the history file, please share it with me, as I’m interested in finding out how you are using this feature.

8. Eliminate the continuous repeated entry from history using HISTCONTROL

In the following example pwd was typed three times, when you do history, you can see all the 3 continuous occurrences of it. To eliminate duplicates, set HISTCONTROL to ignoredups as shown below.
# pwd
# pwd
# pwd
# history | tail -4
44  pwd
45  pwd
46  pwd [Note that there are three pwd commands in history, after
executing pwd 3 times as shown above]
47  history | tail -4

# export HISTCONTROL=ignoredups
# pwd
# pwd
# pwd
# history | tail -3
56  export HISTCONTROL=ignoredups
57  pwd [Note that there is only one pwd command in the history, even after
executing pwd 3 times as shown above]
58  history | tail -3

9. Erase duplicates across the whole history using HISTCONTROL

The ignoredups shown above removes duplicates only if they are consecutive commands. To eliminate duplicates across the whole history, set the HISTCONTROL to erasedups as shown below.
# export HISTCONTROL=erasedups
# pwd
# service httpd stop
# history | tail -3
38  pwd
39  service httpd stop
40  history | tail -3

# ls -ltr
# service httpd stop
# history | tail -6
35  export HISTCONTROL=erasedups
36  pwd
37  history | tail -3
38  ls -ltr
39  service httpd stop
[Note that the previous service httpd stop after pwd got erased]
40  history | tail -6

10. Force history not to remember a particular command using HISTCONTROL

When you execute a command, you can instruct history to ignore it by setting HISTCONTROL to ignorespace AND typing a space in front of the command, as shown below. I can see a lot of junior sysadmins getting excited about this, as they can hide a command from the history. It is good to understand how ignorespace works, but as a best practice, don’t purposefully hide anything from the history.
# export HISTCONTROL=ignorespace
# ls -ltr
# pwd
#  service httpd stop [Note that there is a space at the beginning of service,
to ignore this command from history]
# history | tail -3
67  ls -ltr
68  pwd
69  history | tail -3

11. Clear all the previous history using option -c

Sometimes you may want to clear all the previous history but keep recording history from that point forward.
# history -c

12. Substitute words from history commands

When you are searching through history, you may want to execute a different command but use the same parameter from the command that you’ve just searched.
In the example below, the !!:$ next to the vi command takes the last argument from the previous command and passes it to the current command.
# ls anaconda-ks.cfg
anaconda-ks.cfg
# vi !!:$
vi anaconda-ks.cfg
In the example below, the !^ next to the vi command takes the first argument from the previous command (i.e., the cp command) and passes it to the current command (i.e., the vi command).
# cp anaconda-ks.cfg anaconda-ks.cfg.bak
# vi !^
vi anaconda-ks.cfg

13. Substitute a specific argument for a specific command.

In the example below, !cp:2 searches for the previous command in history that starts with cp and takes the second argument of cp and substitutes it for the ls -l command as shown below.
# cp ~/longname.txt /really/a/very/long/path/long-filename.txt
# ls -l !cp:2
ls -l /really/a/very/long/path/long-filename.txt
In the example below, !cp:$ searches for the previous command in history that starts with cp and takes the last argument (in this case, which is also the second argument as shown above) of cp and substitutes it for the ls -l command as shown below.
# ls -l !cp:$
ls -l /really/a/very/long/path/long-filename.txt

14. Disable the usage of history using HISTSIZE

If you want to disable history altogether and don’t want the bash shell to remember the commands you’ve typed, set HISTSIZE to 0 as shown below.
# export HISTSIZE=0
# history
# [Note that history did not display anything]

15. Ignore specific commands from the history using HISTIGNORE

Sometimes you may not want to clutter your history with basic commands such as pwd and ls. Use HISTIGNORE to specify all the commands that you want to ignore from the history. Please note that adding ls to the HISTIGNORE ignores only ls and not ls -l. So, you have to provide the exact command that you would like to ignore from the history.
# export HISTIGNORE="pwd:ls:ls -ltr:"
# pwd
# ls
# ls -ltr
# service httpd stop

# history | tail -3
79  export HISTIGNORE="pwd:ls:ls -ltr:"
80  service httpd stop
81  history | tail -3
[Note that history did not record pwd, ls and ls -ltr]

HashMap: Principles, Source Code, and Practice


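HashMap is one of the most commonly used data structures. For an application developer, a deeper understanding of its principles and implementation helps you store and retrieve data more efficiently. The JDK version used in this article is 1.5.

Using HashMap

Effective Java argues that in 99% of cases, once you override the equals method you must also override hashCode. By default, both use the “native” implementations in Object, namely: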

public native int hashCode();
public boolean equals(Object obj) {
    return (this == obj);
}


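The hashCode method is declared with the native keyword, which means it is implemented at a lower level in C or C++; you can loosely think of it as returning the object's memory address. The default equals considers two references equal only when they refer to the same object. If you override only equals() and do not redefine hashCode(), then when reading from a HashMap you will not get the value for a key unless you use exactly the same object reference that you stored it under.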
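The effect can be sketched with two hypothetical key classes (BrokenKey and GoodKey are names invented for this illustration, not part of the JDK). It is a minimal sketch: one class overrides only equals, the other keeps hashCode consistent with it.

```java
import java.util.HashMap;
import java.util.Map;

class EqualsHashCodeSketch {
    // Hypothetical key that overrides equals() but NOT hashCode().
    static class BrokenKey {
        final String id;
        BrokenKey(String id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof BrokenKey && ((BrokenKey) o).id.equals(id);
        }
    }

    // Hypothetical key with hashCode() kept consistent with equals().
    static class GoodKey {
        final String id;
        GoodKey(String id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof GoodKey && ((GoodKey) o).id.equals(id);
        }
        @Override public int hashCode() { return id.hashCode(); }
    }

    static String lookupGood() {
        Map<GoodKey, String> map = new HashMap<>();
        map.put(new GoodKey("42"), "value");
        return map.get(new GoodKey("42")); // equal key and equal hash: found
    }

    static String lookupBroken() {
        Map<BrokenKey, String> map = new HashMap<>();
        map.put(new BrokenKey("42"), "value");
        // Equal by equals(), but the inherited identity-based hash almost
        // always differs between the two instances, so this usually misses.
        return map.get(new BrokenKey("42"));
    }

    public static void main(String[] args) {
        System.out.println("good key:   " + lookupGood());
        System.out.println("broken key: " + lookupBroken());
    }
}
```

The broken lookup is described as "usually" missing because two distinct objects could, in rare cases, receive the same identity hash code.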

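You should also avoid using mutable classes as HashMap keys. If you store an object as a key and later change its state, the HashMap becomes inconsistent and the value you stored may no longer be reachable by lookup (although iterating over the collection may still find it). See http://www.ibm.com/developerworks/cn/java/j-jtp02183/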
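A minimal sketch of the mutable-key problem, assuming a hypothetical Key class whose hash code is simply its mutable id field:

```java
import java.util.HashMap;
import java.util.Map;

class MutableKeySketch {
    // Hypothetical mutable key: equals() and hashCode() depend on a field
    // that can change after the key has been stored.
    static class Key {
        int id;
        Key(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }
        @Override public int hashCode() { return id; }
    }

    // Returns {found via mutated key, found via original state, found by scan}.
    static boolean[] demo() {
        Map<Key, String> map = new HashMap<>();
        Key key = new Key(1);
        map.put(key, "value");
        key.id = 2; // mutate the key after insertion

        boolean byMutatedKey = map.get(key) != null;           // hashes elsewhere now
        boolean byOriginalState = map.get(new Key(1)) != null; // stored key no longer equals it
        boolean byScan = map.containsValue("value");           // a full scan still sees the entry
        return new boolean[] { byMutatedKey, byOriginalState, byScan };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("lookup by mutated key:    " + r[0]);
        System.out.println("lookup by original state: " + r[1]);
        System.out.println("found by scanning values: " + r[2]);
    }
}
```

Both lookups fail while the scan succeeds, which matches the "lost but still iterable" behavior described above.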

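How HashMap Stores and Retrieves Entries

A HashMap is essentially a combination of an array and linked lists. The array models a series of buckets (similar to bucket sort) so that keys with different hashCodes can be located quickly; for different keys that share a hashCode, the equals method is called to pick the value matching the key out of the list.

In Java, initializing a HashMap mainly means assigning values to its initialCapacity and loadFactor properties. The former is the length of the key space (the bucket array) used to separate keys by hash value; the latter specifies how full the map may become before it grows automatically. By default initialCapacity is 16 and loadFactor is 0.75: the map starts with 16 buckets, and once the number of entries passes the threshold of 12 (16 × 0.75), the key space is doubled to 32, and so on. This can be seen in the source: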

void addEntry(int hash, K key, V value, int bucketIndex) {
    // The new entry becomes the head of the bucket's chain.
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
    // Double the table once the entry count reaches the threshold.
    if (size++ >= threshold)
        resize(2 * table.length);
}


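Whenever a HashMap grows, the position of every entry may change (because an entry's final position is its hashCode modulo the length of the key space), so resize calls the transfer function to redistribute the entries. This process is called rehashing and is expensive. When the number of entries can be predicted, you should therefore generally avoid the default initialCapacity and pass a value to the constructor instead. For example, to cache 1000 records from a database query in a HashMap keyed by a particular field (such as ID), you can specify an initialCapacity of 2048 to avoid rehashing; a capacity of 1024 would give a threshold of only 1024 × 0.75 = 768, which is less than 1000.

Another point worth noting is that the length of a HashMap's key space is always a power of two, as the following source shows: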

int capacity = 1;
while (capacity < initialCapacity)
    capacity <<= 1;


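Even if the initialCapacity passed to the constructor is not a power of two, capacity is still set to a power of two: the smallest one that is not less than initialCapacity.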
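The rounding rule and the sizing advice for 1000 records can be checked with a small sketch. roundUpToPowerOfTwo repeats the constructor loop quoted above; capacityFor is a hypothetical helper written for this illustration and is not a JDK method.

```java
class CapacitySketch {
    // Same loop as the constructor excerpt: round up to a power of two.
    static int roundUpToPowerOfTwo(int initialCapacity) {
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        return capacity;
    }

    // Hypothetical helper: the smallest power-of-two capacity whose threshold
    // (capacity * loadFactor) is not below the expected number of entries.
    static int capacityFor(int expectedEntries, float loadFactor) {
        int capacity = 1;
        while (capacity * loadFactor < expectedEntries)
            capacity <<= 1;
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOfTwo(1000)); // requested 1000
        System.out.println(roundUpToPowerOfTwo(17));   // requested 17
        System.out.println(capacityFor(1000, 0.75f));  // room for 1000 entries
    }
}
```

Requesting 1000 yields a table of 1024, whose threshold of 768 would still force one resize, which is why the text suggests 2048 for 1000 entries.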

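Why did the engineers at Sun Microsystems make the length of the HashMap key space a power of two? Consider the three criteria for a good hashing scheme given by R. W. Floyd:

  • A good hash algorithm should be very fast to compute
  • A good hash algorithm should minimize collisions
  • If collisions do occur, they should be evenly distributed

To store the hashCode of each element in a key array of length Length, the usual approach is the modulo operation, index = hashCode % Length. Inevitably, the hashCodes of several different objects end up at the same position; this is what we usually call a “collision”. If even distribution and minimal collisions were the only concerns, it would seem that Length should be a prime number (there is no obvious theory to support this, but extensive practice has led mathematicians to conclude that results of a modulus by a prime are less correlated than with other numbers). Craig Larman and Rhett Guthrie accordingly criticized this design sharply in their book “Java Performance”. To get to the bottom of the question, Bruce Eckel (author of Thinking in Java) interviewed Joshua Bloch, the author of java.util.HashMap, and published his reasons for the design online (http://www.roseindia.net/javatutorials/javahashmap.shtml).

The reason for the design is that the modulo operation is slow in most languages, Java included, whereas when the divisor is a power of two the modulo reduces to a simple bit operation that is markedly faster (about 5 to 8 times faster, according to the figures given by Bruce Eckel). Here is how the JDK implements it: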

static int indexFor(int h, int length) {
    return h & (length-1);
}


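When the key space length is a power of two, the index of an element whose hash code is h can be computed with a simple AND instead of the clumsy modulo operation. Suppose an object's hashCode is 35 (binary 100011) and the HashMap uses the default initialCapacity (16); then indexFor yields 100011 & 001111 = 000011, that is, decimal 3, which is exactly 35 mod 16.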
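A short sketch can confirm the equivalence for non-negative hash codes. indexFor is copied from the excerpt above; maskEqualsModulo is a helper name made up for this check.

```java
class IndexForSketch {
    // Copied from the JDK excerpt above.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Checks h & (length - 1) == h % length for every h in [0, maxHash].
    // Only meaningful when length is a power of two.
    static boolean maskEqualsModulo(int length, int maxHash) {
        for (int h = 0; h <= maxHash; h++) {
            if (indexFor(h, length) != h % length)
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(indexFor(35, 16));             // the example from the text
        System.out.println(maskEqualsModulo(16, 100000)); // equivalence on a range
    }
}
```

One difference worth noting: for a negative h, Java's % operator returns a negative remainder, while the mask always produces an index between 0 and length - 1.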

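This method has one problem: the result is determined only by the low-order bits of the object's hashCode, and the high-order bits are masked out entirely. In the example above, 19 (10011), 35 (100011), and 67 (1000011) all produce the same index. To address this, Joshua Bloch took a “defensive programming” approach and applies a second hash to each object's hashCode before using it, as the JDK source shows: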

static int hash(Object x) {
    int h = x.hashCode();
    h += ~(h << 9);
    h ^=  (h >>> 14);
    h +=  (h << 4);
    h ^=  (h >>> 10);
    return h;
}


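The main purpose of this rotating hash function is to make full use of the high-order bits of the original hashCode while balancing computational cost against the statistical properties of the data; the details are beyond the scope of this article.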
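To see the effect, the sketch below feeds hash codes that differ only in their high bits (k << 16) through the bit-mixing steps of the hash() excerpt. The class and method names are invented for this illustration, and the input pattern is just one convenient worst case.

```java
import java.util.HashSet;
import java.util.Set;

class SupplementalHashSketch {
    // Bit-mixing steps copied from the hash() excerpt, applied to an int.
    static int mix(int h) {
        h += ~(h << 9);
        h ^= (h >>> 14);
        h += (h << 4);
        h ^= (h >>> 10);
        return h;
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Counts the distinct bucket indexes hit by the hash codes k << 16 for
    // k = 1..count; these values have identical (all-zero) low bits.
    static int distinctBuckets(int count, int length, boolean useMix) {
        Set<Integer> buckets = new HashSet<>();
        for (int k = 1; k <= count; k++) {
            int h = k << 16;
            buckets.add(indexFor(useMix ? mix(h) : h, length));
        }
        return buckets.size();
    }

    public static void main(String[] args) {
        System.out.println("without mixing: " + distinctBuckets(16, 16, false));
        System.out.println("with mixing:    " + distinctBuckets(16, 16, true));
    }
}
```

Without mixing, every such key lands in bucket 0; with mixing, the high bits influence the index and the keys spread over more than one bucket.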

Another effective way to speed up hashing is to write a good hashCode for your own classes. The String implementation computes it as follows:

for (int i = 0; i < len; i++) {
    h = 31*h + val[off++];
}
hash = h;


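This way of computing a hash code may have first appeared in The C Programming Language by Brian W. Kernighan and Dennis M. Ritchie, and it is regarded as giving the best value for its cost (it is also known as the times33 algorithm, because the multiplier constant is 33 in C; Java changed it to 31). In fact, most objects, including List, compute their hash value this way.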
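As a quick check, the loop can be rewritten as a standalone method and compared against String.hashCode(). hash31 is a name chosen for this sketch.

```java
class Times31Sketch {
    // Polynomial hash with multiplier 31, following the String excerpt above.
    static int hash31(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "Alibaba";
        System.out.println(hash31(s) == s.hashCode()); // same formula, same result
    }
}
```

Integer overflow simply wraps around here, which is the intended behavior for this kind of hash.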

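A rather special hashing technique is the Bloom filter, which trades a small loss of accuracy for a large saving in storage space. It is often used for tasks such as checking whether a user name is already taken or whether an item is on a blacklist. See part 13 of the “Beauty of Mathematics” (数学之美) series (http://googlechinablog.com/2006/08/blog-post.html)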
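The idea can be sketched in a few lines with java.util.BitSet. This is an illustrative toy, not a tuned implementation: the class name, the bit-array size, and the way the k positions are derived from String.hashCode() are all assumptions made for the example.

```java
import java.util.BitSet;

class BloomFilterSketch {
    private final BitSet bits;
    private final int size;      // number of bits in the filter
    private final int hashCount; // number of bit positions set per item

    BloomFilterSketch(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derives the i-th bit position from two base values (double hashing).
    // Using hashCode() and its high half is an illustrative choice only.
    private int position(String item, int i) {
        int h1 = item.hashCode();
        int h2 = (h1 >>> 16) | 1; // odd step so positions vary with i
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String item) {
        for (int i = 0; i < hashCount; i++)
            bits.set(position(item, i));
    }

    // false means "definitely absent"; true means "possibly present".
    boolean mightContain(String item) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(item, i)))
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        BloomFilterSketch taken = new BloomFilterSketch(1 << 16, 3);
        taken.add("alice");
        System.out.println(taken.mightContain("alice")); // added items are always reported
        System.out.println(taken.mightContain("bob"));   // usually false, may be a false positive
    }
}
```

The filter never reports an added item as absent; the price is an occasional false positive, which is the "small loss of accuracy" mentioned above.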

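The Fail-Fast Mechanism

As is well known, HashMap is not a thread-safe collection class. In applications that tolerate faults well, however, if you do not want to pay the synchronization overhead of Hashtable just for a 1% chance of trouble, you can consider relying on HashMap's fail-fast mechanism, which is implemented as follows: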

Entry<K,V> nextEntry() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
    ……
}


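Here modCount is an instance variable of HashMap that is declared volatile, so any thread can see modifications made to it by other threads (under the JVM memory model, each thread keeps its own working memory, whose contents are not synchronized with main memory at every moment, so changes made by one thread could otherwise be invisible to others). When iteration begins, the Iterator copies modCount into expectedModCount; during iteration it compares the two on every step to determine whether the HashMap has been modified, either by the iterating code itself or by another thread. Most of HashMap's modifying methods change modCount; see the source below: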
public V put(K key, V value) {
    K k = maskNull(key);
    int hash = hash(k);
    int i = indexFor(hash, table.length);
    // Existing key: replace the value; modCount is not touched.
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        if (e.hash == hash && eq(k, e.key)) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    // New key: a structural modification, so modCount is incremented.
    modCount++;
    addEntry(hash, k, value, i);
    return null;
}


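Taking put as an example, every time a new entry is added to the HashMap, modCount is incremented. Other methods such as remove and clear contain similar operations.
As the above shows, HashMap's fail-fast mechanism is essentially a form of optimistic locking: check the state, carry on if nothing is wrong, and throw an exception if something is, thereby avoiding the cost of thread synchronization. Below is an example in which fail-fast is triggered in a single-threaded environment: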

class Test {
    public static void main(String[] args) {
        java.util.HashMap map = new java.util.HashMap();
        map.put(new Object(), "a");
        map.put(new Object(), "b");
        java.util.Iterator it = map.keySet().iterator();
        while (it.hasNext()) {
            it.next();
            // Adding a new key while iterating increments modCount.
            map.put("", "");
            System.out.println(map.size());
        }
    }
}


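Running the code above throws java.util.ConcurrentModificationException, because modifying the HashMap during iteration increments modCount. If you comment out the line map.put(new Object(), "b"), the program completes normally: the HashMap then contains only one element when iteration starts, so after one step the iterator has already reached the end, no further check is made, and no exception needs to be thrown.
In ordinary concurrent environments, synchronization is still recommended. This is typically accomplished by synchronizing on some object that naturally encapsulates the map. If no such object exists, the map should be “wrapped” using the Collections.synchronizedMap method. This is best done at creation time, to prevent accidental unsynchronized access.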

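LinkedHashMap

Iterating over a HashMap yields entries in no particular order, so LinkedHashMap is very useful when a client needs a specific iteration order. In its access-ordered mode, for example, this data structure is well suited to building LRU caches. Invoking the put or get method results in an access to the corresponding entry (assuming it still exists after the invocation completes). The putAll method generates one entry access for each mapping in the specified map, in the order in which key-value mappings are provided by the specified map's entry set iterator. Sun's J2SE documentation specifies that no other methods generate entry accesses; in particular, operations on collection views do not affect the iteration order of the backing map.

The implementation of LinkedHashMap differs from that of HashMap in that it maintains a doubly linked list running through all of its entries. This linked list defines the iteration order, which is normally the order in which elements were inserted. The class defines three fields, header, before, and after, to represent the head of the list and the previous/next “pointers”; they are used just as in a textbook doubly linked list. Removing an element, for example: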

private void remove() {
    before.after = after;
    after.before = before;
}


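This simply changes which elements the neighboring pointers refer to.

Because of the added cost of maintaining the linked list, its performance is slightly below that of HashMap, with one exception: iterating over a LinkedHashMap takes time proportional to the number of elements it contains, whereas iterating over a HashMap is likely to be more expensive, because it takes time proportional to its capacity (the length allocated to the key space). In short: use HashMap for random access, and LinkedHashMap for sequential access or iteration.

LinkedHashMap also provides the removeEldestEntry method to support automatic removal of stale data. HashMap cannot offer this, because its entries are unordered. By default removeEldestEntry has no effect in LinkedHashMap; it is implemented as follows: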

protected boolean removeEldestEntry(Map.Entry<K,V> eldest) {
    return false;
}


You can override removeEldestEntry with code like the following:

private static final int MAX_ENTRIES = 100;

protected boolean removeEldestEntry(Map.Entry eldest) {
    return size() > MAX_ENTRIES;
}


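This means that the LinkedHashMap grows at first; once it holds more than MAX_ENTRIES (100) elements, each insertion of a new element automatically removes the element at the front of the linked list (the eldest), which keeps the size of the LinkedHashMap stable.

By default, LinkedHashMap's eviction policy is FIFO, like a queue. If you want more sophisticated behavior such as LRU (least recently used), set accessOrder to true in the constructor, because the element access method (get) internally calls a “hook”, recordAccess, which is implemented as follows: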

void recordAccess(HashMap<K,V> m) {
    LinkedHashMap<K,V> lm = (LinkedHashMap<K,V>)m;
    if (lm.accessOrder) {
        lm.modCount++;
        remove();
        addBefore(lm.header);
    }
}


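The code above does the following: if accessOrder is set to true, each access moves the element to just before header, that is, to the tail of the list. Using removeEldestEntry together with accessOrder gives you a basic in-memory cache; for concrete code see http://bluepopopo.iteye.com/blog/180236 .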
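Putting the two pieces together, a small LRU cache might look like the sketch below. LruCacheSketch and maxEntries are names chosen for this example; the three-argument LinkedHashMap constructor and removeEldestEntry are the standard JDK API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCacheSketch(int maxEntries) {
        // Third argument: accessOrder = true, so get() moves an entry to the tail.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, Integer> cache = new LruCacheSketch<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // evicts "b", the least recently used entry
        System.out.println(cache.keySet());
    }
}
```

With accessOrder left at its default of false, the same class would evict in insertion order instead, which is the FIFO behavior described earlier.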

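WeakHashMap

99% of Java textbooks teach us not to interfere with the JVM's garbage collection, yet Java does have four kinds of references that are closely tied to it: strong, soft, weak, and phantom references.

The default HashMap manages its entries with strong keys, analogous to strong references. This means that even when the key object is no longer in use (nothing outside the map refers to it), the entry is still retained in the HashMap. In some situations (an in-memory cache, for example) these stale entries can cause problems such as memory leaks.

WeakHashMap's strategy is that once the key object is no longer in use (it has outlived its useful lifetime), the map does not prevent the garbage collector from clearing the entry, even if the machine is not short of memory. However, because GC runs as a very low-priority thread, it will not necessarily discover weakly reachable objects quickly unless you invoke it explicitly, as in the example below: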

import java.util.Map;
import java.util.WeakHashMap;

class WeakTest {
    public static void main(String[] args) {
        Map<String, String> map = new WeakHashMap<String, String>();
        map.put(new String("Alibaba"), "alibaba");
        while (map.containsKey("Alibaba")) {
            try {
                Thread.sleep(500);
            } catch (InterruptedException ignored) {
            }
            System.out.println("Checking for empty");
            // Only a request; the JVM decides when to collect.
            System.gc();
        }
    }
}


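The code above prints Checking for empty once and then the main thread exits, which means that GC cleared new String("Alibaba") in its most recent collection cycle and WeakHashMap promptly removed the entry for that key. If the type of map is changed to HashMap, which uses strong references internally, the entry remains in the map even when GC is invoked explicitly, and the program keeps printing Checking for empty. Moreover, when using WeakHashMap, if you change

map.put(new String("Alibaba"), "alibaba");

to

map.put("Alibaba", "alibaba");

the program still keeps printing Checking for empty. This does not contradict our earlier analysis of WeakHashMap's weak references: to reduce the cost of creating and maintaining multiple identical Strings, the JVM uses the flyweight pattern internally (see 《JAVA与模式》), and the literal "Alibaba" is held in the constant pool rather than as an ordinary heap object, so it is not reclaimed by GC even when no variable refers to it. Weak references are particularly suitable for objects that occupy a lot of memory but are easy to recreate after garbage collection has reclaimed them.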

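Between HashMap and WeakHashMap sits SoftHashMap. Its soft-reference strategy means that the garbage collector does not collect softly reachable objects as eagerly as it collects weakly reachable ones; instead, it collects softly reachable objects only when it really “needs” memory. A soft reference is the collector's way of “turning a blind eye”: “as long as memory is not too tight, I will keep this object; but if memory becomes really tight, I will collect and dispose of it.” In this respect it is actually better suited to implementing caches than WeakHashMap. Unfortunately, Java does not ship a SoftHashMap class (Apache and Google provide third-party implementations), but it does provide two very important classes, java.lang.ref.SoftReference and ReferenceQueue, which let you be notified when an object's reference state changes. See the implementation of the processQueue method in com.alibaba.common.collection.SofthashMap: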

private ReferenceQueue queue = new ReferenceQueue();
ValueCell vc;
Map hash = new HashMap(initialCapacity, loadFactor);
……
while ((vc = (ValueCell) queue.poll()) != null) {
    if (vc.isValid()) {
        hash.remove(vc.key);
    } else {
        valueCell.dropped--;
    }
}
}


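The processQueue method is called from almost every SoftHashMap method. Through the ReferenceQueue's poll method, the JVM tells us that an object has expired and that current memory conditions require it to be released; at that point we can remove it from the hash map. In fact, Alibaba's MemoryCache uses SoftHashMap by default.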
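The same pattern can be sketched with only JDK classes. This is not the Alibaba implementation: SoftCacheSketch and SoftValue are names invented for this illustration, and only SoftReference and ReferenceQueue are real JDK types.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

class SoftCacheSketch<K, V> {
    // Soft reference that remembers its key, so a cleared value can be
    // traced back to the map entry that held it.
    private static final class SoftValue<K, V> extends SoftReference<V> {
        final K key;
        SoftValue(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, SoftValue<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    // Drops entries whose values the collector has already cleared.
    @SuppressWarnings("unchecked")
    private void processQueue() {
        Object polled;
        while ((polled = queue.poll()) != null) {
            SoftValue<K, V> ref = (SoftValue<K, V>) polled;
            // Remove only if the map still holds this exact reference.
            if (map.get(ref.key) == ref)
                map.remove(ref.key);
        }
    }

    V get(K key) {
        processQueue();
        SoftValue<K, V> ref = map.get(key);
        return ref == null ? null : ref.get();
    }

    void put(K key, V value) {
        processQueue();
        map.put(key, new SoftValue<>(key, value, queue));
    }

    int size() {
        processQueue();
        return map.size();
    }

    public static void main(String[] args) {
        SoftCacheSketch<String, String> cache = new SoftCacheSketch<>();
        cache.put("Alibaba", "alibaba");
        System.out.println(cache.get("Alibaba"));
    }
}
```

Here the values are softly held while the keys stay strong; a value that the collector clears under memory pressure shows up on the queue, and the next call to any method removes its entry.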