[Troubleshooting Notes] Investigating a Dictionary.Add() Performance Issue

According to the MSDN documentation:

If Count is less than the capacity, this method approaches an O(1) operation. If the capacity must be increased to accommodate the new element, this method becomes an O(n) operation, where n is Count.

As long as the number of items in a Dictionary does not exceed its Capacity (which can be specified in the constructor), adding an item is an O(1) operation: the time per Add() call does not depend on how much data the Dictionary already holds and should, in theory, stay constant.
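As a minimal illustration (my own sketch, not part of the original case; it assumes the usual using System; and using System.Collections.Generic; directives), pre-sizing the dictionary in the constructor avoids the O(n) resize that otherwise kicks in whenever Count outgrows the current capacity:

// Sketch: a capacity hint lets Add() stay O(1) for the whole bulk insert.
var presized = new Dictionary<int, string>(20000); // capacity specified up front
for (var i = 0; i < 20000; i++)
{
    presized.Add(i, i.ToString()); // no internal resize/rehash is triggered
}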

Reader Cb Atm reported a curious case that breaks the O(1) rule. I simplified and rewrote it as follows:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;

namespace DictKey
{
    class Program2
    {
        const int CAPACITY = 20000;

        public struct Index
        {
            public string StkNo;
            public byte TypeCode;
            public int Idx;
        }

        static Random rnd = new Random(32767);

        static List<Index> stocks =
            Enumerable.Range(0, CAPACITY).Select(
                o => new Index()
                {
                    Idx = o,
                    StkNo = rnd.Next(1000, 1011).ToString(),
                    TypeCode = (byte)rnd.Next(0, 10)
                }).ToList();

        static void Test1()
        {
            var dictStruct = new Dictionary<Index, int>(CAPACITY);
            Action<int> test = (count) =>
            {
                Stopwatch sw = new Stopwatch();
                dictStruct.Clear();
                sw.Start();
                for (var i = 0; i < count; i++)
                {
                    dictStruct.Add(stocks[i], i);
                }
                sw.Stop();
                Console.WriteLine("{1:n0} items: {0:n0}ms",
                    sw.ElapsedMilliseconds, dictStruct.Count);
            };
            test(10000);
            test(CAPACITY);
        }

        static void Main(string[] args)
        {
            Test1();
            Console.Read();
        }
    }
}

The program declares an Index struct to serve as the Dictionary's TKey type. The test generates 20,000 random records (StkNo is restricted to values between 1000 and 1010 and TypeCode to 0-9, while Idx is a unique sequential number), then measures Dictionary<Index, int>.Add() for 10,000 items and for 20,000 items. If Add() were O(1), the latter should take roughly twice as long as the former. It does not:

10,000 items: 1,993ms
20,000 items: 7,869ms

Inserting 20,000 items took about four times as long as inserting 10,000. Why?

I made a small change, removing the range restriction on the random StkNo values by using StkNo = rnd.Next().ToString() (which widens the effective range to 0 through int.MaxValue - 1), and got a completely different result!

10,000 items: 6ms
20,000 items: 5ms

Besides running hundreds of times faster, inserting 10,000 items and inserting 20,000 items now take essentially the same time. (This indicates the time spent in Add() is no longer the bottleneck; the difference is small enough to be ignored.)

Changing it again to StkNo = rnd.Next(1000, 1100).ToString(), so that StkNo falls between 1000 and 1099, produced:

10,000 items: 247ms
20,000 items: 877ms

This is 8-9 times faster than the original version, and inserting 20,000 items takes about 3.5 times as long as inserting 10,000.

My understanding of hash table internals used to be limited. The pattern was clear: the diversity of StkNo values has a huge impact on performance, but I did not know why. Some reading revealed that when a struct is used as TKey, key comparison involves GetHashCode(), but no amount of theory beats the authority and clarity of the source code itself. So I used JustDecompile to lift the veil on Dictionary<TKey, TValue>.Add():

private void Insert(TKey key, TValue value, bool add)
{
    int num;
    if (key == null)
    {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
    }
    if (this.buckets == null)
    {
        this.Initialize(0);
    }
    int hashCode = this.comparer.GetHashCode(key) & 2147483647;
    int length = hashCode % (int)this.buckets.Length;
    int num1 = 0;
    for (int i = this.buckets[length]; i >= 0; i = this.entries[i].next)
    {
        if (this.entries[i].hashCode == hashCode &&
            this.comparer.Equals(this.entries[i].key, key))
        {
            if (add)
            {
                ThrowHelper.ThrowArgumentException(
                    ExceptionResource.Argument_AddingDuplicate);
            }
            this.entries[i].@value = value;
            Dictionary<TKey, TValue> dictionary = this;
            dictionary.version = dictionary.version + 1;
            return;
        }
        num1++;
    }
    // ... (remainder of the decompiled method omitted)

Following the logic: when inserting (Add() calls Insert(key, value, true) behind the scenes), the method first obtains the key's hashCode via GetHashCode(), then applies the bucket assignment rule (take the remainder of hashCode divided by the number of buckets; entries with the same remainder land in the same bucket) and walks the existing keys in that bucket. If an existing entry has the same hashCode and the old and new keys are also equal by Equals(), the add parameter decides whether to throw a duplicate-key exception or overwrite the existing value.
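To make the bucket rule concrete, here is a small sketch of my own (not part of the decompiled source; bucketCount stands in for buckets.Length) reproducing those two steps:

// Sketch of the bucket selection shown in Insert(); bucketCount is hypothetical.
static int GetBucketIndex<TKey>(TKey key, IEqualityComparer<TKey> comparer, int bucketCount)
{
    // Clear the sign bit, exactly like "& 2147483647" in the decompiled code.
    int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
    // Keys whose hashCodes leave the same remainder share a bucket,
    // and the for loop in Insert() walks every entry already stored there.
    return hashCode % bucketCount;
}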

From this we can see:

  1. hashCodes can repeat, and multiple distinct hashCodes can map to the same bucket
  2. Each insert first locates the bucket for the key's hashCode, then compares the key against every existing entry in that bucket
  3. The equality check is this.entries[i].hashCode == hashCode && this.comparer.Equals(this.entries[i].key, key); the second-stage Equals() comparison runs only when the hashCodes match.

We can therefore infer that the more often hashCodes repeat, the more keys pile up in a few buckets and the more extra Equals() calls get executed, which is bound to drag performance down. In the extreme, each Add() degrades toward O(n), so inserting n items costs roughly O(n^2) in total, which fits the observation that doubling the item count quadrupled the elapsed time.
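A quick back-of-the-envelope check (my own estimate, assuming the keys spread evenly over the distinct hash codes): with k distinct hash codes, the i-th insert has to run Equals() against roughly i / k earlier entries, so n inserts cost about n * n / (2 * k) Equals() calls in total:

// Rough estimate of total Equals() calls when n keys share only k distinct hash codes.
// Assumes keys are spread evenly across the k hash codes (a simplification).
static long EstimateEqualsCalls(int n, int k)
{
    long total = 0;
    for (long i = 0; i < n; i++)
        total += i / k;          // ~i/k existing entries share the new key's hash code
    return total;                // approximately n * n / (2 * k)
}

// EstimateEqualsCalls(10000, 11) is about 4.5 million
// EstimateEqualsCalls(20000, 11) is about 18 million -> roughly 4x, matching the measured ratio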

So does inserting 10,000 items take nearly 2 seconds because StkNo has too little variety, causing massive duplication of hash codes? The following program checks:

static void Test2()
{
    var hashCodeStats = new Dictionary<int, int>();
    for (var i = 0; i < 20000; i++)
    {
        var hashCode = stocks[i].GetHashCode();
        if (hashCodeStats.ContainsKey(hashCode))
            hashCodeStats[hashCode]++;
        else
            hashCodeStats.Add(hashCode, 1);
    }
    Console.WriteLine("{0} distinct HashCodes in total", hashCodeStats.Count);
}

The results:

StkNo = rnd.Next(1000, 1011).ToString(): only 11 distinct HashCodes among 20,000 records
StkNo = rnd.Next(1000, 1100).ToString(): only 100 distinct HashCodes among 20,000 records
StkNo = rnd.Next().ToString(): 20,000 distinct HashCodes among 20,000 records

Even though all 20,000 records have a unique Idx and varying TypeCode values, GetHashCode() appears to depend only on StkNo. The following program confirms this:

static void Test3()
{
    var idx1 = new Index()
    {
        StkNo = "AAAA",
        Idx = 1,
        TypeCode = 2
    };
    var idx2 = new Index()
    {
        StkNo = "AAAA",
        Idx = 100,
        TypeCode = 200
    };
    var idx3 = new Index()
    {
        StkNo = "AAAB",
        Idx = 1,
        TypeCode = 2
    };
    Console.WriteLine("{0} vs {1} vs {2}",
        idx1.GetHashCode(), 
        idx2.GetHashCode(), 
        idx3.GetHashCode());
}

The output is 80968112 vs 80968112 vs -651627219: when StkNo is the same, GetHashCode() is the same (presumably because the default ValueType.GetHashCode() implementation folds in only one of the struct's fields, here StkNo).

From this analysis, there are two directions for improving insert performance: 1) use the unique Idx value as the key instead, or 2) eliminate the massive duplication in Index.GetHashCode().

The first approach changes Dictionary<Index, int> to Dictionary<int, int>:

static void Test1A()
{
    var dictStruct = new Dictionary<int, int>(CAPACITY);
    Action<int> test = (count) =>
    {
        Stopwatch sw = new Stopwatch();
        dictStruct.Clear();
        sw.Start();
        for (var i = 0; i < count; i++)
        {
            dictStruct.Add(stocks[i].Idx, i);
        }
        sw.Stop();
        Console.WriteLine("{0:n0} items: {1:n0}ms {2:n0}ticks",
            dictStruct.Count,
            sw.ElapsedMilliseconds,
            sw.ElapsedTicks);
    };
    for (var i = 0; i < 5; i++)
    {
        test(10000);
        test(CAPACITY);
    }
}

After the change, both the 10,000-item and the 20,000-item runs report 0ms; they are too fast to measure in milliseconds, so we compare ElapsedTicks instead. Running the test five times and ignoring the first warm-up round, inserting 20,000 items takes roughly twice as long as inserting 10,000.

10,000 items: 1ms 3,471ticks
20,000 items: 0ms 2,248ticks
10,000 items: 0ms 545ticks
20,000 items: 0ms 1,127ticks
10,000 items: 0ms 561ticks
20,000 items: 0ms 1,140ticks
10,000 items: 0ms 540ticks
20,000 items: 0ms 1,082ticks
10,000 items: 0ms 573ticks
20,000 items: 0ms 1,113ticks

But switching to Dictionary<int, int> has a downside: after retrieving a value by its numeric key, you still need a separate lookup to get back to the Index instance, so the surrounding code has to be rewritten. The second approach is to override Index.GetHashCode() to return a hash with far better uniqueness (the sequential integer Idx is a good choice), like this:

public struct Index
{
    public string StkNo;
    public byte TypeCode;
    public int Idx;
    public override int GetHashCode()
    {
        return Idx;
    }
}

static void Test1()
{
    Action<int> test = (count) =>
    {
        var dictStruct = new Dictionary<Index, int>(CAPACITY);
        Stopwatch sw = new Stopwatch();
        dictStruct.Clear();
        sw.Start();
        for (var i = 0; i < count; i++)
        {
            dictStruct.Add(stocks[i], i);
        }
        sw.Stop();
        Console.WriteLine("{0:n0} items: {1:n0}ms {2:n0}ticks",
            dictStruct.Count,
            sw.ElapsedMilliseconds,
            sw.ElapsedTicks);
    };
    for (var i = 0; i < 5; i++)
    {
        test(10000);
        test(CAPACITY);
    }
}

The results are below. With the custom GetHashCode(), the tick counts are about twice those of Dictionary<int, int>, but every run now finishes in under 1ms, and inserting 20,000 items takes roughly 2.5 times as long as inserting 10,000:

10,000 items: 3ms 11,730ticks
20,000 items: 0ms 2,292ticks
10,000 items: 0ms 1,471ticks
20,000 items: 0ms 2,035ticks
10,000 items: 0ms 1,089ticks
20,000 items: 0ms 2,749ticks
10,000 items: 0ms 1,064ticks
20,000 items: 0ms 2,247ticks
10,000 items: 0ms 1,078ticks
20,000 items: 0ms 2,677ticks
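One caveat worth noting (my own addition, not measured in this test): even when hash codes never collide, Insert() still calls Equals() on a hashCode match, and a struct without an Equals override falls back to the boxing, reflection-based ValueType.Equals. Implementing IEquatable<Index> alongside the GetHashCode() override keeps that comparison cheap; a minimal sketch, assuming (as in the test data) that Idx uniquely identifies a record:

// Sketch: pairing the GetHashCode() override with IEquatable<Index>.
public struct Index : IEquatable<Index>
{
    public string StkNo;
    public byte TypeCode;
    public int Idx;

    public override int GetHashCode()
    {
        return Idx;                  // unique sequential number -> no hash collisions
    }

    public bool Equals(Index other)  // avoids boxing and reflection-based ValueType.Equals
    {
        return Idx == other.Idx && StkNo == other.StkNo && TypeCode == other.TypeCode;
    }

    public override bool Equals(object obj)
    {
        return obj is Index && Equals((Index)obj);
    }
}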

[Conclusion] When a struct is used as a Dictionary TKey, a GetHashCode() that produces heavily duplicated values will severely hurt Add() performance. Consider overriding GetHashCode() to return more distinct values, or switch to a simple key type such as string or int for the best performance.

