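A density plot shows how probability is distributed over the values of a variable, much as a histogram shows how counts are distributed. The two charts are therefore closely related: a density plot is essentially a smoothed histogram. In R this shows up in hist(): setting its freq argument to FALSE rescales the y-axis from counts to density, so that the bar areas sum to 1. In base R a density plot is drawn with plot(density(x)); in ggplot2 the corresponding layer is geom_density(). Because the two charts share the same idea, any histogram can also be drawn as a density plot.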
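The link between the two charts can be checked directly. The following is a minimal sketch using simulated data; the seed, sample size, and the name bar_area are arbitrary choices for illustration. It draws a density-scaled histogram, overlays the kernel density estimate of the same data, and computes the total bar area.

```r
# Simulated data; the seed and sample size are arbitrary.
set.seed(1)
x <- rnorm(200)

# freq = FALSE rescales the bars so that their total area equals 1.
h <- hist(x, freq = FALSE, col = "aquamarine3", border = "white",
          main = "Histogram with density curve", xlab = "Value")

# Overlay the kernel density estimate of the same data.
lines(density(x), col = "aquamarine4", lwd = 2)

# Total bar area: bar height (density) times bar width.
bar_area <- sum(h$density * diff(h$breaks))
```

Because both layers are on the density scale, the curve and the bars can share one y-axis without any further rescaling.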
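Density plot
Before plotting, use density() to compute a kernel density estimate from the data, then pass the result to plot() to draw the density curve. To fill the area under the curve, use polygon().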
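> # Simulate 100 standard-normal values
> data<-data.frame(value=rnorm(100))
> # Kernel density estimate of the values
> value<-density(data$value)
> plot(value, col="aquamarine4", main="Density Chart", xlab="Value", ylab="Density")
> # Fill the area under the curve
> polygon(value, col="aquamarine4", border="aquamarine4")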
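To see what density() actually returns, the object can be inspected before it is plotted. The sketch below uses simulated data; the seed, the adjust value of 2, and the variable names are illustrative assumptions. It looks at the evaluation grid and bandwidth, approximates the area under the curve, and compares the default curve with a smoother one.

```r
set.seed(2)
x <- rnorm(100)
d <- density(x)

# d$x is the evaluation grid (512 points by default), d$y the estimated density.
n_points <- length(d$x)
bandwidth <- d$bw   # bandwidth chosen by the default rule

# Approximate the area under the curve with the trapezoidal rule.
area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)

# adjust multiplies the bandwidth; a larger value gives a smoother curve.
d_smooth <- density(x, adjust = 2)

plot(d, col = "aquamarine4", main = "Effect of bandwidth")
lines(d_smooth, col = "aquamarine3", lty = 2)
```

The area should come out very close to 1, which is what makes the curve a density rather than a count.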
In ggplot2 the density layer is geom_density(), and it is used in much the same way as geom_histogram().
> library(ggplot2)
> p<-ggplot(data, aes(x=value))+
+ geom_density(fill="aquamarine4", color="aquamarine4", alpha=0.6)+
+ theme_minimal()+
+ ggtitle("ggplot Density Chart")
> p
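Grouped density plots
To compare the densities of two or more groups in base R, split the plotting area with par(mfrow=c(rows, columns)), just as with histograms.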
> gender<-c(rep("Male",times=100), rep("Female", times=100))
> score<-c(rnorm(100, mean=0, sd=1), rnorm(100, mean=3, sd=2))
> data<-data.frame(gender, score)
> library(dplyr)
> Female<-filter(data, gender=="Female")
> Male<-filter(data, gender=="Male")
> # One row, two columns of panels
> par(mfrow=c(1,2))
> plot(density(Female$score), col="aquamarine4", main="Female", xlab="score", ylab="density")
> polygon(density(Female$score), col="aquamarine4", border="aquamarine4")
> plot(density(Male$score), col="aquamarine3", main="Male", xlab="score", ylab="density")
> polygon(density(Male$score), col="aquamarine3", border="aquamarine3")
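With separate panels, each plot picks its own axis limits by default, which can make the groups harder to compare. As one possible variation, the sketch below uses only base R: split() replaces the dplyr filter() calls, and both panels share the same axis limits. The seed and the names dens, x_range, and y_max are illustrative assumptions.

```r
set.seed(5)
gender <- c(rep("Male", times = 100), rep("Female", times = 100))
score  <- c(rnorm(100, mean = 0, sd = 1), rnorm(100, mean = 3, sd = 2))
data   <- data.frame(gender, score)

# One density object per group; split() orders the groups alphabetically.
dens <- lapply(split(data$score, data$gender), density)

# Shared limits so that the panels can be compared directly.
x_range <- range(sapply(dens, function(d) range(d$x)))
y_max   <- max(sapply(dens, function(d) max(d$y)))

# Draw one panel per group, then restore the previous layout.
old_par <- par(mfrow = c(1, length(dens)))
for (g in names(dens)) {
  plot(dens[[g]], xlim = x_range, ylim = c(0, y_max), col = "aquamarine4",
       main = g, xlab = "score", ylab = "density")
  polygon(dens[[g]], col = "aquamarine4", border = "aquamarine4")
}
par(old_par)
```

Looping over names(dens) also means the same code works unchanged if a third group is added to the data.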
Replacing geom_histogram() with geom_density() and splitting the panels with facet_wrap() gives grouped density plots in ggplot2.
> gender<-c(rep("Male",times=100), rep("Female", times=100))
> score<-c(rnorm(100, mean=0, sd=1), rnorm(100, mean=3, sd=2))
> data<-data.frame(gender, score)
> p<-ggplot(data, aes(x=score, fill=gender))+
+ geom_density()+
+ scale_fill_manual(values=c("aquamarine4", "aquamarine3"))+
+ theme_minimal()+
+ theme(legend.position="none")+
+ facet_wrap(~gender)
> p
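Overlaid density plots
Overlaying works a little differently from histograms in base R. plot() has no add argument for density objects, so add=T cannot be used to stack the curves; instead, the additional densities can be drawn directly with polygon().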
> male<-c(rnorm(100, mean=0, sd=1))
> female<-c(rnorm(100, mean=3, sd=2))
> plot(density(female), col=rgb(0.4,0.8,0.67,0.6), ylim=c(0,0.5), bty="l", xlab="score", ylab="density", main="")
> polygon(density(male), col=rgb(0.27,0.55,0.45, 0.6))
> polygon(density(female), col=rgb(0.4,0.8,0.67,0.6))
> legend("topright", c("Male", "Female"), bty="n", fill=c(rgb(0.27,0.55,0.45,0.6),rgb(0.4,0.8,0.67,0.6)))
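The fixed ylim of 0 to 0.5 works for these particular samples. As a hedged alternative, the limits can be derived from the density objects themselves so that neither curve is clipped when the data change. In the sketch below, the seed and the names d_male, d_female, x_range, and y_max are illustrative assumptions.

```r
set.seed(3)
male   <- rnorm(100, mean = 0, sd = 1)
female <- rnorm(100, mean = 3, sd = 2)

d_male   <- density(male)
d_female <- density(female)

# Axis limits that cover both curves.
x_range <- range(d_male$x, d_female$x)
y_max   <- max(d_male$y, d_female$y)

# Set up an empty frame first, then add one polygon per group.
plot(x_range, c(0, y_max), type = "n", bty = "l",
     xlab = "score", ylab = "density", main = "")
polygon(d_male,   col = rgb(0.27, 0.55, 0.45, 0.6))
polygon(d_female, col = rgb(0.4, 0.8, 0.67, 0.6))
legend("topright", c("Male", "Female"), bty = "n",
       fill = c(rgb(0.27, 0.55, 0.45, 0.6), rgb(0.4, 0.8, 0.67, 0.6)))
```

Computing each density once and reusing it also avoids calling density() twice per group.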
In ggplot2, density plots are overlaid in exactly the same way as histograms.
> gender<-c(rep("Male",times=100), rep("Female", times=100))
> score<-c(rnorm(100, mean=0, sd=1), rnorm(100, mean=3, sd=2))
> data<-data.frame(gender, score)
> p<-ggplot(data, aes(x=score, fill=gender))+
+ geom_density(alpha=0.5, position="identity")+
+ scale_fill_manual(values=c("aquamarine4", "aquamarine3"))+
+ theme_minimal()+
+ theme(legend.title=element_blank())
> p
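Mirror density plot
A mirror density plot is built the same way as a mirror histogram; remember to convert each group with density() first. Density values are not limited to the range 0 to 1, but for these data the curves stay below 0.5, so the y-axis is set to run from 0 to 0.5 (reversed in the lower panel).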
> male<-c(rnorm(100, mean=0, sd=1))
> female<-c(rnorm(100, mean=3, sd=2))
> # Two stacked panels; the upper panel has no bottom margin
> par(mfrow=c(2,1))
> par(mar=c(0,5,1,1))
> plot(density(male), main="", xlab="", ylab="density", xaxt="n", ylim=c(0,0.5), bty="l", col="aquamarine4")
> polygon(density(male), col="aquamarine4", border="aquamarine4")
> legend("topright", c("Male"), bty="n", fill="aquamarine4")
> # Lower panel: a reversed ylim flips the curve downward
> par(mar=c(3,5,0,1))
> plot(density(female), main="", xlab="score", ylab="density", ylim=c(0.5,0), bty="l", col="aquamarine3")
> polygon(density(female), col="aquamarine3", border="aquamarine3")
> legend("bottomright", c("Female"), bty="n", fill="aquamarine3")
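The upper limit of 0.5 used above suits these samples, but it is not a general bound: a density is a height whose area integrates to 1, so the curve itself can rise above 1 when the data sit in a narrow range. The sketch below illustrates this with simulated data; the seed, the standard deviation of 0.1, and the variable names are assumptions chosen for the demonstration.

```r
set.seed(4)

# Tightly concentrated data: standard deviation of 0.1.
narrow <- rnorm(1000, mean = 0, sd = 0.1)
d <- density(narrow)

# Highest point of the estimated curve.
peak <- max(d$y)

# The area under the curve is still approximately 1 (trapezoidal rule).
area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)

plot(d, col = "aquamarine4", main = "Density above 1", xlab = "Value")
```

When fixing ylim by hand for a mirror plot, it is therefore safer to check max(d$y) for each group first.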
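In ggplot2, geom_density() already puts density rather than count on the y-axis; the lower curve is mirrored by negating the computed density.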
> male<-c(rnorm(100, mean=0, sd=1))
> female<-c(rnorm(100, mean=3, sd=2))
> data<-data.frame(male, female)
> # after_stat() replaces the deprecated ..density.. notation (needs ggplot2 3.3.0 or later)
> # annotate() draws each label once instead of once per data row
> p<-ggplot(data)+
+ geom_density( aes(x=male, y=after_stat(density)), fill="aquamarine4", color="grey", alpha=0.6)+
+ annotate("label", x=5, y=0.25, label="Male", color="aquamarine4")+
+ geom_density( aes(x=female, y=after_stat(-density)), fill="aquamarine3", color="grey", alpha=0.6)+
+ annotate("label", x=5, y=-0.25, label="Female", color="aquamarine3")+
+ xlab("score")+
+ ylab("density")+
+ theme_minimal()+
+ ggtitle("ggplot Mirror Density Chart")
> p