--- The hyper sequencer ---

♪Do-Re-Mi-Fa-Sort

Sorry, Caution

　Complete content is written with the Japanese edition, but the English version is simplified.

　And the English version uses the machine translation.

　Possibly an inappropriate expression and a meaning may not come appropriately.

　Therefore, please think that this article is for tentativeness under construction. ( but always under construction ....... )

♪Do-Re-Mi-Fa-Sort

　The quick sort spreads with high efficiency widely worldwide.

　When a specific condition gathers, there is the sort technique that is higher-speed than quick sort. However, as for the quick sort, the world is generally fastest for the sort method usable widely and it lasts more than 50 years and it is wide all over the world and, regardless of a condition, has the used results.

　Speaking of a sort, it is quick sort. In the world of the programming, it is classic.

　However, it is late and may be irritated though there is no bounds to human greed and uses the quick sort. Oops, "human" is a selfish animal.

　But quick sort has a weak point. There being the data line that processing becomes extremely slow.

　It is faster than the quick sort that the writer hangs time of approximately one month and remodels it and rolls it up and brought about when the ♪Do-Re-Mi-Fa-Sort cannot solve a problem of the neighborhood and works and is the high speed sort technique that let high speed performance not to be defeated by data line such as median killers match. First "♪" is the name, too.

　♪Do-Re-Mi-Fa-Sort faster than quick sort, it may be said that it is the world fastest sort technique as a use for public.

　Because ♪Do-Re-Mi-Fa-Sort is quick sort, too.

Example of the execute time

　I performed the test with a computer of windows 7.( Intel Core i7 980X Extream Edition 3.33GHz )

　Record format is

struct SortRecStruct
{
DWORD Key;
void* Value;
};

　In the win32 environment, record length is 8 bytes, and the x64 environment, record length is 16 bytes.

　The example win32 environment. Record count is 50 million ( 50, 000,000 ) cases. A unit of number is milli second. For a test, turbo boost removes movement. It becomes slightly late.

Random32 Random32R Random15 Random15R Forward Reverse Constant Med Killer

qsort 12027.68 12012.08 6801.64 6739.24 3369.62 3572.42 280.80 20358.13

QuickSort 5881.24 5881.24 3837.62 3837.62 1170.01 1185.61 1575.61 6021.64

QuickSort2 5740.84 5725.24 3822.02 3853.22 1014.01 1029.61 1560.01 5834.44

QuickSortP 5522.44 5538.04 3603.62 3634.82 1014.01 1014.01 1357.21 5709.64

DoReMiFaSort 4648.83 4633.23 2730.02 2714.42 748.80 764.40 31.20 1950.01

　The median killer which is a natural enemy of the quick sort is handled fast. ♪Do-Re-Mi-Fa-Sort loves a median killer !?

　"qsort" is a thing using the standard qsort function.
　"QuickSort" was based on a common list placed in various sites. A type (thing performing it by a function name called med3) to choose from top, center of the data area, three places of the end when I choose the pivot.
　"QuickSort2" remodeled "QuickSort".Some movement becomes fast.
　"QuickSortP" remodeled "QuickSort2". It is late as the array type, I switch to a pointer type.

　When I rewrite this numerical value in the ratio when I assumed speed of the quick sort 100% ( unit is a percentage )

Random32 Random32R Random15 Random15R Forward Reverse Constant Med Killer

qsort 204.51 204.24 177.24 175.61 288.00 301.32 17.82 338.08

QuickSort 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00

QuickSort2 97.61 97.35 99.59 100.41 86.67 86.84 99.01 96.89

QuickSortP 93.90 94.16 93.90 94.72 86.67 85.53 86.14 94.82

DoReMiFaSort 79.05 78.78 71.14 70.73 64.00 64.47 1.98 32.38

　Next, x64 case. Unit is milli second. ( record length is 16 bytes ).

Random32 Random32R Random15 Random15R Forward Reverse Constant Med Killer

qsort 11980.88 11980.88 6848.44 6817.24 3229.22 3525.62 265.20 20935.33

QuickSort 6255.64 6255.64 4258.83 4258.83 1341.61 1404.01 2012.41 7534.85

QuickSort2 6068.44 6068.44 4212.03 4227.63 1232.41 1294.81 1981.21 7425.65

QuickSortP 5460.04 5444.43 3666.02 3697.22 1185.61 1216.81 1669.21 7160.45

DoReMiFaSort 4820.43 4820.43 2854.82 2854.82 951.61 982.81 78.00 2308.81

　And percentage table ( unit is a percentage )

Random32 Random32R Random15 Random15R Forward Reverse Constant Med Killer

qsort 191.52 191.52 160.81 160.07 240.70 251.11 13.18 277.85

QuickSort 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00

QuickSort2 97.01 97.01 98.90 99.27 91.86 92.22 98.45 98.55

QuickSortP 87.28 87.03 86.08 86.81 88.37 86.67 82.95 95.03

DoReMiFaSort 77.06 77.06 67.03 67.03 70.93 70.00 3.88 30.64

Quick3 sort

　I expressed the real nature of ♪Do-Re-Mi-Fa-Sort with quick sort, but the technique called the Quick3 sort is really a base. This is technique to be called 3 way partition or 3 way segmentation. The words are difficult, but are the method to separate to three of data smaller than a pivot, data bigger than data, a pivot having a key same as pivoting.

Introduction article of Quick3 which developer oneself wrote ( by Mr.Robert Sedgewick and Mr.Jon Bentley. )

　In the quick sort, the data having the key to value same as a pivot exist in both in the group which should be bigger than a pivot in the group which should be smaller than a pivot. It is a correct answer to express when there may be it in both exactly.

　But there is only one pivot if it is data without an overlap level. May if sentence and the swap that are unnecessary for such data become slow adversely when I handle it?　wu-u-u-u-u-n... When there is an overlap level, there are few cases to be full of pivot data, and it should be often said with not more than several. Therefore it may be honest substance that the technique of Quick3 does not have popularity enough although it is announced.

　On the other hand, the writer gave various tests for a fast method. Not only the quick sort but also other methods or the original method that I found by oneself are included. The item diverges into many branches whether you cannot employ XMM register effectively whether an L1 cache is not available positively. I read the article that was not interesting that around 5% of speedup could expect if I made pivoting three by multi-pivoting. However, there was little transfer number of times and finished that I was based on quick sort how and was going to come to the highest-speed conclusion.

　But processing was a conclusion to become fast if there was a condition and could call the ratio after the data separation as far as there were few even thing and one that 50% were near because the next summons number of times largely decreased.

　Because it comes from it on the same side of data whether 50% of ratios become near, in a sense it will be that there is the luck. But I seem to be able to make insignificant efforts by devising how to choose pivoting.

　However, it can come true when the contents "which even one can call as far as there is few it" use Quick3. In the normal quick sort, the data of the pivot are within the range as next sort data depending on the form of the source, too. But it is right a good condition because a summons is taken when I use Quick3 as far as I excluded the data of the pivot. But a laborer is necessary for Quick3 of the original only a little to realize it.

　After all the median killer is the data which I do not like, and, in such Quick3, transaction speed becomes largely slow. Oh my god.

Point of the speed of ♪Do-Re-Mi-Fa-Sort

　Therefore it is the ♪Do-Re-Mi-Fa-Sort technique that what I devised to solve a problem becomes the name of the title. When I choose the pivot, this does not adopt a central value from three places and adopts the median from five places of values. I do not choose the central value simply then from five places and sort five places intentionally.

　I invalidate even the data line such as the median killer by sorting it intentionally and can change it into the state that it is convenient for one's processing.

　With these specific five places (not any place), someone may name it A, B, C, D, E, but there is not too relish. Therefore called the "do" "re" "mi" "fa" "so" it is the origin of the name that named. But I do not completely sort all five places and I omit the last "do" and "re" and speed up. Because the median is at the position of the "mi" by all means, I adopt the value as pivoting. I call it "Do-Mi-So Selector" personally.

　Depending on processing system, I seem to do that I choose the central value from nine candidates, but, as is expected, an overhead is too big and becomes late when nine do predisposal. Five became the just right speed.

　The next part asking to come has not been yet settled as far as I do not include pivot data only in this. When there is overlap pivoting as the original of Quick3, the part which processing efficiency decreases only a little is included.

　Therefore I confirm the range of the continuity of overlap data before handling Quick3 and open the range of the pivot beforehand if a value next to the pivot position is a pivot level. 3 beautiful way partition is finally completed by doing it this way. When overlap data sort many data at the same time, I am effective in letting you stop processing immediately when you discovered a large quantity of overlap data. When it is Visual C++, it is blocked one only while sentence.

　It is not necessary for the Forward data to sort it. Therefore even quick sort should become the same transaction speed in, ♪Do-Re-Mi-Fa-Sort. Besides, ♪Do-Re-Mi-Fa-Sort will become originally later than quick sort because I will take an unnecessary step in the beginning. However, actually, it is finished faster than quick sort. For processing of this neighborhood, I give influence. Of course there is the effect of the hybrid constitution, too. When there is much number of the records, I only separate a range smaller than a pivot and a range bigger than a pivot properly and am that performance changes. Take it off from next processing whether include even pivoting of only one in next processing; is meant.

　In the quick sort, include pivot data itself in the next standing in line range.
　( Because there is not the "reserved seat " of the pivot, it is not revealed where it is stored )

+0 +1 +2 +3 +4 +5 +6 +7 +8 +9

small small small large pivot large large pivot large large

// next calling procedure: Quick( Array, 0, 2 ); Quick( Array, 3, 9 ); // with 2 pivot

　In ♪Do-Re-Mi-Fa-Sort, exclude the pivot data from the next standing in line range surely.　
( Using "reserved seat" of pivot )

　It is not necessary to include the pivot data in the next range if the retracted position of the pivot is correct.

　Because should invest only the range that is smaller than a pivot and a big range in a next request.

+0 +1 +2 +3 +4 +5 +6 +7 +8 +9

small small small pivot pivot large large large large large

// next calling procedure: Quick( Array, 0, 2 ); Quick( Array, 5, 9 ); // lost 2 pivot ( Intention )

　For example, it is said that it was almost separated at the center when it is 100 records, and there are not overlap data.

　Total the second handling of quick sort ranges becomes 100 cases of 50 + 50 and agrees with the number of the request records of the cause cause.
　Even if processing advances, and the range of the summons becomes the small range so much; total of the processing range as 100.
　In contrast, the total of the second handling of, ♪Do-Re-Mi-Fa-Sort range becomes 99 cases of 49 + 50 and one case is advantageous and is slightly glad.

　The total of the processing range becomes 97 cases when it becomes the third and three cases are advantageous and are a little gladder.

　In other words processing numbers out of the object increase steadily so as to advance if processing advances.

　When there is much number of pivot by overlap data, the next summons range becomes smaller and is convenient.

　Because the median killer is overlap data, rather high-speed processing is possible; come to love it.

　When there is many it, it is called or does not know the number of the records tens of millions of times after this.

　However, it is not within the range to the very end when I exclude the pivoting of, ♪Do-Re-Mi-Fa-Sort from the target range once.

　I increase steadily so that a number excluding it advances if processing advances.

　The story that should only change this into a genuine article from the condition of the supposition.

　The theory is very simple, but implementation is extremely difficult. It is necessary to move pivoting fast at the correct position.

　I should design Quck3 based on this theory, but do not become as good as requested because implementation is incomplete.

　It is available if I reorganize it for the movement of data of Small custom Quick3, only just a little !!!

　But, in fact, the answer is simple.

　I divide it into two of a group smaller than a pivot and the group bigger than a pivot, and, in fact, the theory to handle each range is original idealism itself of the quick sort.

　However, the ideal must put a description of unnecessary processing if I implement it though I am beautiful.

　Precious rapidity is lost when I accept unnecessary processing. Therefore I do not do the unnecessary thing　―　The talk that I have only given up on the way. Did the judgment "to become slow" end it by an impracticable theory whether it was really the judgment that I tried?

　Eventually, the quick sort that I used so far came to use the list of sources which made move that it was not an original ideal of the quick sort. Though there was not it and was near, there was the part which did not meet a theory somewhere only for a theory of the quick sort. I am slightly sorry that I think so. But a list of sources is short, and there is the effect finishing. It is so, and there is the meaning to be OK with few register variables visually.

　In other words a sort to work according to the ideal theory of the quick sort passed more than 50 years, and finally came out; is talked. The thing which adds a unique pivot selector to the quick sort that finally came out is, ♪Do-Re-Mi-Fa-Sort. Though it should be studied all over the world, I did not appear so far for some reason. It is so very mysterious.

　If Quick3 is performed a screw of in the last; is fastest at that point the world; should have been sorted, and should have attracted attention as the technique that completely followed a theory of the quick sort. Precious ! Quick3.

　I think that there was not the time that they are busy with work, and is enough.

　So, ♪Do-Re-Mi-Fa-Sort is

Small custom Quick3
+
Original pivot selector

　And

It may be the genuine quick sort that realized a beautiful theory of the quick sort for the first time.

　

Diagram:

pivot selector --> pivot range adjuster --> Quick3

　It is a digression, but can process multi-thread in peace even if I handle the remainder in multi-thread because data smaller than a pivot and data bigger than a pivot are completely separated because the part which an address is piled up is impossible.

　♪Do-Re-Mi-Fa-Sort is fast in the same state, but does it for the hybrid constitution with the Insertion sort when I change it to Insertion sort when the number of the records decreased because about 100ms degree tended to become fast after all. But I usually change the remainder such as eight at 16 degree, but a processing part of, ♪Do-Re-Mi-Fa-Sort sets threshold counter of the change to 48 by the relations that an overhead is big. In addition, I originally take off original processing at the age of the number of the collected small records and let a source list be refreshing because I would adopt Insertion sort (Oh no, I see it enough complicatedly)…).

　In addition, it is a list of sources, but describes it with a non-recurrence summons model to prepare for stack by paying own expenses to be for win32, and to reduce the overhead of the recurrence call. On the other hand, the x64 use is described with a recurrence call model because an argument is the specifications called the register ferry, and the person with the recurrence call on is quick in movement speed. Both may be methods having one what I try if your system is not Windows. But, please rearrange the securing of memory function for stack for win32 in an appropriate function because you use the "HeapAlloc" function of Windows not "malloc" function. In addition, I can reduce one if sentence by the non-recurrence type. What I look for what if sentence decreases may be interesting.

　With the non-recurrence type, there is the characteristic that there become extremely few consumption levels of the stack. Even a normal recurrence summons model becomes few consumptions than the general quick sort. When circumstances, anyway, to want to hold down a consumption of the stack use the non-recurrence type though the speed is important once, a high speed sort can realize enough even a stack domain of only several kilobytes. Please analyze a list of sources of the non-recurrence type by yourself why you seem to be. In a list of ordinary sources, the contents with the big meaning play hide-and-seek.

　In the site of this homepage, the extension of an available file name is limited. Because you place it in a text file, approve the list of sources.

DoReMiFaSort.txt　stdafx.txt

　For transaction speed improvement, I do it on the basis of pointer type notation. Insertion sort to use inside becomes the pointer type.
　With the independent stack type for win32 called label "IncStack", and "subtract" stack pointer following last label. Because the stack is not working area and uses the genuine stack domain at the assembler level, decrement is reversed to increment. Movement on Visual C++ is reversed to a label name by the trace. In other words I perform the basic development with an assembler, and it means that I transplanted it adversely in Visual C++. When some movement make a serious thing, the which transplanted it in Visual C++ after I make it with an assembler is early clearly and may make it. It depends on a person.
　E flat and F sharp position data become the small factor of the speed performance in the case of Quick3, you should add a hand. This is a phenomenon peculiar to Quick3. I do not add a hand in the public source.

　I did not develop it for self-satisfaction.

　Even one second is two seconds, but there is a person requiring who wants to carry out even a little fast. I answered the demand only a little.

　I answer the voice of the user. The user showed a smile. When the smile of the user was seen, it is the best joy.

　The "technique" of the self-satisfaction is not necessary. The "technique" is for another person. It is for a smile.

　Many researchers study a high applied article. A sort is fundamental researches than it. It was too basic, and the person who studied it has disappeared. In the coming times, a general individual may perform the basic study.

　I hope, If ♪Do-Re-Mi-Fa-Sort is helpful for a human being hoping for even slightly fast processing, I am glad.

　Because I am black cat.

Special thanks

　In this sort technique, a characteristic to hope for by Quick3 was provided.

　It is the story that only demonstrated high-speed performance of Quick3 that I went.

　I thank Mr.Robert Sedgewick and Mr.Jon Bentley, who I develop Quick3, and were shown gratis.

　Thank you very much.

　And Of course Mr. C.A.R Hoare too.

contents( Japanese Language )

Top Page caution Bubble Sort Odd Even Sort Cocktail Sort Gnome Sort Selection Sort Min Max Sort Insertion Sort Shell Sort Heap Sort Comb Sort Merge Sort Quick Sort Do-Re-Mi-Fa Sort

	Random32	Random32R	Random15	Random15R	Forward	Reverse	Constant	Med Killer
qsort	`12027.68`	`12012.08`	`6801.64`	`6739.24`	`3369.62`	`3572.42`	`280.80`	`20358.13`
QuickSort	`5881.24`	`5881.24`	`3837.62`	`3837.62`	`1170.01`	`1185.61`	`1575.61`	`6021.64`
QuickSort2	`5740.84`	`5725.24`	`3822.02`	`3853.22`	`1014.01`	`1029.61`	`1560.01`	`5834.44`
QuickSortP	`5522.44`	`5538.04`	`3603.62`	`3634.82`	`1014.01`	`1014.01`	`1357.21`	`5709.64`
DoReMiFaSort	`4648.83`	`4633.23`	`2730.02`	`2714.42`	`748.80`	`764.40`	`31.20`	`1950.01`