Aliasing (computing)
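Aliasing describes a situation in which a single data location in memory can be accessed through more than one name in a program. Modifying the data through one name also changes the value seen through every other alias, which the programmer may not expect. Aliasing therefore makes programs harder to understand, analyze, and optimize. Alias analysis computes information about the aliases in a program.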
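As a minimal sketch of the definition above, the following C fragment gives one int two names, a variable and a pointer to it. The names `add_to`, `alias_demo`, and `balance` are invented for this illustration.

```c
#include <stdio.h>

/* Adds `amount` to the int that `p` points to. */
static void add_to(int *p, int amount)
{
    *p += amount;
}

/* Modifies `balance` only through an alias and returns the value
   that the original name then observes. */
int alias_demo(void)
{
    int balance = 100;
    int *alias = &balance;   /* `alias` and `balance` refer to the same memory */

    add_to(alias, 50);       /* the write goes through the alias ... */
    printf("balance = %d\n", balance);
    return balance;          /* ... and the original name sees the change */
}
```

Calling `alias_demo()` returns 150, even though no statement assigns to `balance` by name after its initialization.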
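Examples
Buffer overflow
Most implementations of the C programming language do not perform array bounds checking. This can be exploited, by accident or on purpose, to write outside the bounds of an array (a buffer overflow). The C standard makes this undefined behavior, but on most implementations without bounds checking the aliasing effect described above appears: changing the data through one name changes the value seen through its alias.
If an array is created on the call stack and a variable happens to be laid out directly before or after it, writing to an element outside the array's index range may modify that variable. For example, given an int array of two elements (named arr) followed by an int variable (named i), if arr[2] (which would be the third element) occupies the same location as i, the two names alias each other.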
#include <stdio.h>

int main(void)
{
    int arr[2] = { 1, 2 };
    int i = 10;

    /* Write beyond the end of arr. Undefined behaviour in standard C; will write to i in some implementations. */
    arr[2] = 20;

    printf("element 0: %d \t", arr[0]); // outputs 1
    printf("element 1: %d \t", arr[1]); // outputs 2
    printf("element 2: %d \t", arr[2]); // outputs 20, if aliasing occurred
    printf("i: %d \t\t", i); // might also output 20, not 10, because of aliasing, but the compiler might have i stored in a register and print 10

    /* arr size is still 2. sizeof yields a size_t, so print it with %zu. */
    printf("arr size: %zu \n", sizeof(arr) / sizeof(arr[0]));
    return 0;
}
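This result is possible in some implementations of C because an array is a block of contiguous memory, and an element is referenced by taking the address of the start of the block, adding the index multiplied by the element size, and dereferencing the result. C performs no bounds checking, so such an access can land outside the array. The aliasing effect shown above is undefined behavior: some implementations do not place stack variables immediately next to the array, for example because they align variables to a multiple of the processor's word size. The C standard does not generally specify how data is laid out in memory (ISO/IEC 9899:1999, section 6.2.6.1).
It is equally valid for a C compiler to produce no aliasing effect when a program accesses a location outside the bounds of an array.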
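The address arithmetic described above can be sketched in C as follows. `element_address` and `checked_store` are hypothetical helpers written for this illustration; C itself provides no built-in bounds-checked store, so the second function shows one way a caller might add the check by hand.

```c
#include <stddef.h>

/* Address of element `index`, computed by hand:
   start of the block + index * element size. */
int *element_address(int *base, size_t index)
{
    unsigned char *bytes = (unsigned char *)base;
    return (int *)(bytes + index * sizeof(int));
}

/* Store with an explicit bounds check.
   Returns 0 on success, -1 if `index` is outside the array. */
int checked_store(int *arr, size_t len, size_t index, int value)
{
    if (index >= len)
        return -1;               /* refuse the out-of-range write */
    *element_address(arr, index) = value;
    return 0;
}
```

For a two-element array, `element_address(arr, 1)` equals `&arr[1]`, and `checked_store(arr, 2, 2, 20)` returns -1 instead of writing past the end.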
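Aliased pointers
Another kind of aliasing can occur in any language that lets different variables (for example, pointers) refer to the same location in memory. An example is the XOR swap algorithm written as a function that takes two pointers and assumes they point to different locations. If the two pointers are equal (that is, aliases of each other), the function fails: the XOR swap sets the shared value to zero. This is a common problem for functions that accept pointer arguments. Whether they tolerate aliased arguments must be documented explicitly, particularly for functions that perform complex manipulations on the memory areas passed to them.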
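The XOR swap failure can be sketched as below. `xor_swap` is the textbook form; `xor_swap_safe` is one possible way to tolerate aliased arguments, shown here as an assumption rather than as the only fix.

```c
/* Classic XOR swap: correct only when a and b point to different objects. */
void xor_swap(unsigned *a, unsigned *b)
{
    *a ^= *b;   /* if a == b, this already sets the shared value to 0 */
    *b ^= *a;
    *a ^= *b;
}

/* Variant that tolerates aliasing: swapping an object with itself is a no-op. */
void xor_swap_safe(unsigned *a, unsigned *b)
{
    if (a == b)
        return;
    xor_swap(a, b);
}
```

With distinct arguments both functions swap the values. With `xor_swap(&z, &z)` the value of `z` becomes 0, while `xor_swap_safe(&z, &z)` leaves it unchanged.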
Specified aliasing
Controlled aliasing behaviour may be desirable in some cases (that is, aliasing behaviour that is specified, unlike that enabled by memory layout in C). It is common practice in Fortran. The Perl programming language specifies, in some constructs, aliasing behaviour, such as in foreach loops. This allows certain data structures to be modified directly with less code. For example,
my @array = (1, 2, 3);
foreach my $element (@array) {
    # Increment $element, thus automatically
    # modifying @array, since $element is aliased
    # to each of @array's elements in turn.
    $element++;
}
print "@array \n";
will print out "2 3 4" as a result. If one wanted to bypass aliasing effects, one could copy the contents of the index variable into another and change the copy.
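The same contrast between changing an element through an alias and changing a copy can be sketched in C, the language of the earlier examples. This is an analogue, not a translation of the Perl code, and the function names are invented for illustration.

```c
#include <stddef.h>

/* Aliasing version: `element` points at each slot in turn,
   so incrementing through it modifies the array itself. */
void increment_in_place(int *array, size_t len)
{
    for (size_t k = 0; k < len; k++) {
        int *element = &array[k];
        (*element)++;
    }
}

/* Copying version: each element is copied first, so only the copy
   changes. Returns the sum of the incremented copies. */
long sum_of_incremented_copies(const int *array, size_t len)
{
    long sum = 0;
    for (size_t k = 0; k < len; k++) {
        int copy = array[k];
        copy++;
        sum += copy;
    }
    return sum;
}
```

For the array 1, 2, 3 the copying version returns 9 and leaves the array untouched, while the in-place version turns it into 2, 3, 4.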
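Conflicts with optimization
Optimizing compilers often have to make conservative assumptions about variables when pointers are present, for example about whether constant propagation can be applied. Code reordering is affected as well: only when the compiler can rule out aliasing may it move code in ways that improve instruction scheduling or enable more loop optimizations.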
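A small sketch of why constant propagation is blocked: in `store_both` the compiler cannot replace the return value with the constant 5, because the two pointers might alias. The second function uses the C99 `restrict` qualifier, which is not discussed in this article and is included only to show how a programmer can promise the absence of aliasing. Both function names are hypothetical.

```c
/* If a == b, the store through b overwrites *a, so *a must be
   reloaded before returning; the result is 6 in that case, not 5. */
int store_both(int *a, int *b)
{
    *a = 5;
    *b = 6;
    return *a;
}

/* `restrict` is a promise by the caller that a and b do not alias,
   which permits the compiler to return the constant 5 directly.
   Calling this with aliased pointers is undefined behavior. */
int store_both_restrict(int *restrict a, int *restrict b)
{
    *a = 5;
    *b = 6;
    return *a;
}
```

`store_both(&x, &y)` returns 5 for distinct objects, while `store_both(&x, &x)` returns 6. The compiler has to generate code that is correct for both calls.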
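The C99 standard of the C language specifies the strict aliasing rule (section 6.5, paragraph 7): apart from a few exceptions, accessing the same memory location through pointers of different, incompatible types is undefined behavior. A compiler may therefore assume that pointers of different types do not alias, which can yield substantial performance gains.[1] Some well-known projects, such as Python 2, violated this rule.[2] The Linux kernel has dealt with similar issues.[3] The gcc option -fno-strict-aliasing turns the rule off.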
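Under this rule, the stored value of an object may be accessed only through an lvalue of one of the following types (the list follows the wording of the C++ standard):
- the dynamic type of the object
- a cv-qualified version of the dynamic type
- a signed or unsigned version of the dynamic type
- an aggregate type (such as a struct or class) or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, nested types)
- a base class type of the dynamic type
- char or unsigned char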
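The following sketch shows two accesses that stay within the rule. It assumes that float and uint32_t have the same size, which holds on common platforms but is not guaranteed by the standard. The helper names are invented for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Obtain the bit pattern of a float without violating strict aliasing:
   memcpy copies the bytes instead of reading the float through a
   uint32_t lvalue. Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Permitted by the rule: any object may be examined through
   an lvalue of type unsigned char. */
unsigned char first_byte(const void *object)
{
    const unsigned char *bytes = object;
    return bytes[0];
}

/* Not permitted, shown only as a comment:
       uint32_t bits = *(uint32_t *)&f;
   This reads a float object through an lvalue of an incompatible type. */
```

Compilers commonly turn the memcpy call into a plain register move, so the conforming version need not be slower than the pointer cast.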
Hardware aliasing
The term aliasing is also used to describe the situation where, due to either a hardware design choice or a hardware failure, one or more of the available address bits is not used in the memory selection process.[4] This may be a design decision if there are more address bits available than are necessary to support the installed memory device(s). In a failure, one or more address bits may be shorted together, or may be forced to ground (logic 0) or the supply voltage (logic 1).
Example
For this example, we assume a memory design with 8 locations, requiring only 3 address lines (or bits), since 2³ = 8. Address bits (named A2 through A0) are decoded to select unique memory locations as follows, in standard binary counter fashion:
A2 | A1 | A0 | Memory location |
---|---|---|---|
0 | 0 | 0 | 0 |
0 | 0 | 1 | 1 |
0 | 1 | 0 | 2 |
0 | 1 | 1 | 3 |
1 | 0 | 0 | 4 |
1 | 0 | 1 | 5 |
1 | 1 | 0 | 6 |
1 | 1 | 1 | 7 |
In the table above, each of the 8 unique combinations of address bits selects a different memory location. However, if one address bit (say A2) were to be shorted to ground, the table would be modified as follows:
A2 | A1 | A0 | Memory location |
---|---|---|---|
0 | 0 | 0 | 0 |
0 | 0 | 1 | 1 |
0 | 1 | 0 | 2 |
0 | 1 | 1 | 3 |
0 | 0 | 0 | 0 |
0 | 0 | 1 | 1 |
0 | 1 | 0 | 2 |
0 | 1 | 1 | 3 |
In this case, with A2 always being zero, the first four memory locations are duplicated and appear again as the second four. Memory locations 4 through 7 have become inaccessible.
If this change occurred to a different address bit, the decoding results would be different, but in general the effect would be the same: the loss of a single address bit cuts the available memory space in half, with resulting duplication (aliasing) of the remaining space.
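The grounded-A2 case from the tables can be simulated in software. The sketch below models the 8-location memory as an array and forces address bit A2 to zero before every access. All names are hypothetical and the model ignores everything except address decoding.

```c
#include <stdint.h>

#define NUM_LOCATIONS 8u   /* 3 address lines: A2, A1, A0 */
#define A2_MASK       0x4u /* bit position of address line A2 */

/* Location actually selected when A2 is shorted to ground (always 0). */
unsigned decode_a2_grounded(unsigned address)
{
    return (address & (NUM_LOCATIONS - 1u)) & ~A2_MASK;
}

/* Write through the faulty decoder. */
void faulty_write(uint8_t memory[NUM_LOCATIONS], unsigned address, uint8_t value)
{
    memory[decode_a2_grounded(address)] = value;
}

/* Read through the faulty decoder. */
uint8_t faulty_read(const uint8_t memory[NUM_LOCATIONS], unsigned address)
{
    return memory[decode_a2_grounded(address)];
}
```

A write to address 6 lands in location 2, so a later read of address 2 returns the value written to address 6, and location 6 itself is never touched. Memory tests can look for this kind of unexpected duplication to detect a stuck address line.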
References
1. Mike Acton. Understanding Strict Aliasing. 2006-06-01. Retrieved 2017-11-20. Archived from the original on 2013-05-08.
2. Neil Schemenauer. ANSI strict aliasing and Python. 2003-07-17. Retrieved 2017-11-20. Archived from the original on 2020-06-05.
3. Linus Torvalds. Re: Invalid compilation without -fno-strict-aliasing. 2003-02-26. Retrieved 2017-11-20. Archived from the original on 2020-11-12.
4. Michael Barr. Software Based Memory Testing. 2012-07-27. Retrieved 2017-11-20. Archived from the original on 2020-11-29.
External links
- Understanding Strict Aliasing (archived at the Internet Archive) – article by Mike Acton
- Aliasing, pointer casts and gcc 3.3 (archived at the Internet Archive) – informational article on NetBSD mailing list
- Type-based alias analysis in C++ (archived at the Internet Archive) – informational article on type-based alias analysis in C++
- Understand C/C++ Strict Aliasing (archived at the Internet Archive) – article on strict aliasing originally from the boost developer's wiki