
Why does `free` in C not take the number of bytes to be freed?

radiobox 2020. 9. 24. 07:42


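Just to be clear: I do know that malloc and free are implemented in the C library, which usually allocates chunks of memory from the OS, does its own management to parcel out smaller lots of memory to the application, and keeps track of the number of bytes allocated. This question is not "How does free know how much to free?".

Rather, I want to know why free was made this way in the first place. C being a low-level language, I think it would be perfectly reasonable to ask a C programmer to keep track not only of what memory was allocated but also of how much (in fact, I commonly find that I end up keeping track of the number of bytes malloced anyway). It also occurs to me that explicitly giving the number of bytes to free might allow for some performance optimisations: for example, an allocator that has separate pools for different allocation sizes would be able to determine which pool to free from just by looking at the input arguments, and there would be less space overhead overall.

So, in short, why were malloc and free created such that they are required to keep track internally of the number of bytes allocated? Is it just a historical accident?

A small edit: a few people have raised points like "what if you free a different amount than what you allocated?". The API I imagine could simply require one to free exactly the number of bytes allocated; freeing more or less could simply be UB or implementation-defined. I don't want to discourage discussion of other possibilities, though.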


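One-argument free(void *) (introduced in Unix V7) has another major advantage over the earlier two-argument mfree(void *, size_t) that I haven't seen mentioned here: one-argument free dramatically simplifies every other API that works with heap memory. For example, if free needed the size of the memory block, then strdup would somehow have to return two values (pointer + size) instead of one (pointer), and C makes multiple-value returns much more cumbersome than single-value returns. Instead of char *strdup(char *) we would have to write char *strdup(char *, size_t *) or else struct CharPWithSize { char *val; size_t size; }; struct CharPWithSize strdup(char *). (Nowadays that second option looks pretty tempting, because we know that NUL-terminated strings are the "most catastrophic design bug in the history of computing", but that is hindsight speaking. Back in the 70s, C's ability to handle strings as a simple char * was actually considered a defining advantage over competitors like Pascal and Algol.) Plus, it isn't just strdup that suffers from this problem: it affects every system-defined or user-defined function that allocates heap memory.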
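To make the two alternative signatures concrete, here is a minimal sketch of what a strdup-like function might have to look like if the caller needed the size for a later sized free. The names strdup_with_size and strdup_struct are hypothetical and exist only for this illustration; they are not part of any C library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical shape 1: the block size comes back through an out-parameter. */
char *strdup_with_size(const char *s, size_t *size_out) {
    size_t size = strlen(s) + 1;        /* bytes including the terminating NUL */
    char *copy = malloc(size);
    if (copy == NULL)
        return NULL;                    /* *size_out is left untouched on failure */
    memcpy(copy, s, size);
    *size_out = size;                   /* caller must keep this for a sized free */
    return copy;
}

/* Hypothetical shape 2: pointer and size travel together in a struct. */
struct CharPWithSize { char *val; size_t size; };

struct CharPWithSize strdup_struct(const char *s) {
    struct CharPWithSize r = { NULL, 0 };
    r.val = strdup_with_size(s, &r.size);
    return r;
}
```

Either way, every caller now carries two values around where the one-argument design needs only the pointer.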

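The early Unix designers were very clever people, and there are many reasons why free is better than mfree, so basically I think the answer to the question is that they noticed this and designed their system accordingly. I doubt you'll find any direct record of what was going on inside their heads at the moment they made that decision. But we can imagine.

Pretend that you're writing applications in C to run on V6 Unix, with its two-argument mfree. You've managed okay so far, but keeping track of these pointer sizes is becoming more and more of a hassle as your programs become more ambitious and make more and more use of heap-allocated variables. But then you have a brilliant idea: instead of copying these size_ts around all the time, you can just write some utility functions that stash the size directly inside the allocated memory: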

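/* Stash the requested size in a header word just before the block handed out. */
void *my_alloc(size_t size) {
    void *block = malloc(sizeof(size) + size);
    if (block == NULL)
        return NULL;                       /* allocation failed */
    *(size_t *)block = size;
    return (void *) ((size_t *)block + 1); /* caller sees the bytes after the header */
}
/* Recover the stashed size and hand the whole block to the two-argument mfree. */
void my_free(void *block) {
    block = (size_t *)block - 1;
    /* The header itself was part of the allocation, so include it in the size. */
    mfree(block, sizeof(size_t) + *(size_t *)block);
}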

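And the more code you write using these new functions, the more awesome they seem. Not only do they make your code easier to write, they also make your code faster, two things that don't often go together! Before, you were passing these size_ts around all over the place, which added CPU overhead for the copying, meant you had to spill registers more often (especially for the extra function arguments), and wasted memory (since nested function calls often result in multiple copies of the size_t being stored in different stack frames). In your new system, you still have to spend the memory to store the size_t, but only once, and it never gets copied anywhere. These may seem like small efficiencies, but keep in mind that we're talking about high-end machines with 256 KiB of RAM.

This makes you happy! So you share your cool trick with the bearded men who are working on the next Unix release, but it doesn't make them happy, it makes them sad. You see, they were just in the process of adding a bunch of new utility functions like strdup, and they realize that people using your cool trick won't be able to use their new functions, because their new functions all use the cumbersome pointer + size API. And then that makes you sad as well, because you realize you'll have to rewrite the good strdup(char *) function yourself in every program you write, instead of being able to use the system version.

But wait! This is 1977, and backwards compatibility won't be invented for another 5 years! And besides, no one serious actually uses this obscure "Unix" thing with its off-color name. The first edition of K&R is on its way to the publisher now, but that's no problem: it says right on the first page that "C provides no operations to deal directly with composite objects such as character strings... there is no heap...". At this point in history, string.h and malloc are vendor extensions (!). So, suggests Bearded Man #1, we can change them however we like; why don't we just declare your tricky allocator to be the official allocator?

A few days later, Bearded Man #2 sees the new API and says that this is better than before, but it's still spending an entire word per allocation storing the size. He views this as the next thing to blasphemy. Everyone else looks at him like he's crazy. That night he stays late and invents a new allocator that doesn't store the size at all, but instead infers it on the fly by performing black magic bitshifts on the pointer value, and swaps it in while keeping the new API in place. The new API means that no one notices the switch, but they do notice that the next morning the compiler uses 10% less RAM.

And now everyone's happy: you get your easier-to-write and faster code, Bearded Man #1 gets to write a nice simple strdup that people will actually use, and Bearded Man #2, confident that he's earned his keep for a bit, goes back to messing around with quines. Ship it!

Or at least, that's how it could have happened.


"Why does free in C not take the number of bytes to be freed?"

Because there's no need for it, and it wouldn't quite make sense anyway.

When you allocate something, you have to tell the system how many bytes to allocate (for obvious reasons).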

However, when you have already allocated your object, the size of the memory region you get back is now determined. It's implicit. It's one contiguous block of memory. You can't deallocate part of it (let's forget realloc(), that's not what it's doing anyway), you can only deallocate the entire thing. You can't "deallocate X bytes" either -- you either free the memory block you got from malloc() or you don't.

And now, if you want to free it, you can just tell the memory manager system: "here's this pointer, free() the block it is pointing to." - and the memory manager will know how to do that, either because it implicitly knows the size, or because it might not even need the size.

For example, most typical implementations of malloc() maintain a linked list of pointers to free and allocated memory blocks. If you pass a pointer to free(), it will just search for that pointer in the "allocated" list, un-link the corresponding node and attach it to the "free" list. It didn't even need the region size. It will only need that information when it potentially attempts to re-use the block in question.
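The list-based bookkeeping described above can be sketched roughly as follows. This is a toy under stated assumptions, not how any particular C library works: the arena size, the node pool, and the names toy_alloc and toy_free are all made up for illustration. The point to notice is that toy_free finds the block by pointer alone, and the recorded size is consulted only when a freed block is considered for reuse.

```c
#include <stddef.h>

#define ARENA_SIZE 1024   /* toy arena, size chosen arbitrarily */
#define MAX_BLOCKS 32     /* toy limit on the number of tracked blocks */

struct node { unsigned char *addr; size_t size; struct node *next; };

static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;
static struct node nodes[MAX_BLOCKS];
static size_t nodes_used = 0;
static struct node *allocated_list = NULL;
static struct node *free_list = NULL;

void *toy_alloc(size_t size) {
    struct node *n = NULL;
    if (size == 0 || size > ARENA_SIZE)
        return NULL;
    /* Reuse a freed block if one is big enough; only here is the size needed. */
    for (struct node **link = &free_list; *link != NULL; link = &(*link)->next) {
        if ((*link)->size >= size) {
            n = *link;
            *link = n->next;
            break;
        }
    }
    if (n == NULL) {
        /* Otherwise carve a fresh block, rounded up to preserve alignment. */
        size_t a = _Alignof(max_align_t);
        size_t rounded = (size + a - 1) / a * a;
        if (nodes_used == MAX_BLOCKS || rounded > ARENA_SIZE - arena_used)
            return NULL;
        n = &nodes[nodes_used++];
        n->addr = arena + arena_used;
        n->size = rounded;
        arena_used += rounded;
    }
    n->next = allocated_list;
    allocated_list = n;
    return n->addr;
}

/* Moves the node for p from the "allocated" list to the "free" list.
 * Returns 1 on success, 0 if p is not a live allocation. No size needed. */
int toy_free(void *p) {
    for (struct node **link = &allocated_list; *link != NULL; link = &(*link)->next) {
        if ((*link)->addr == p) {
            struct node *n = *link;
            *link = n->next;       /* unlink from "allocated" */
            n->next = free_list;   /* push onto "free" */
            free_list = n;
            return 1;
        }
    }
    return 0;
}
```

A side effect of this design is that a double free or a bogus pointer is simply not found in the allocated list, so toy_free can report it instead of corrupting state.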


C may not be as "abstract" as C++, but it's still intended to be an abstraction over assembly. To that end, the lowest-level details are taken out of the equation. This prevents you from having to furtle about with alignment and padding, for the most part, which would make all your C programs non-portable.

In short, this is the entire point of writing an abstraction.


Actually, in the ancient Unix kernel memory allocator, mfree() took a size argument. malloc() and mfree() kept two arrays (one for core memory, another one for swap) that contained information on free block addresses and sizes.

There was no userspace allocator until Unix V6 (programs would just use sbrk()). In Unix V6, iolib included an allocator with alloc(size) and a free() call which did not take a size argument. Each memory block was preceded by its size and a pointer to the next block. The pointer was only used on free blocks, when walking the free list, and was reused as block memory on in-use blocks.

In Unix 32V and in Unix V7, this was substituted by a new malloc() and free() implementation, where free() did not take a size argument. The implementation was a circular list, each chunk was preceded by a word that contained a pointer to the next chunk, and a "busy" (allocated) bit. So, malloc()/free() didn't even keep track of an explicit size.
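A rough sketch of that idea, written in modern C rather than taken from the V7 source: each chunk is preceded by one header word that points to the next chunk and carries a "busy" flag in its low bit, so a chunk's extent is implied by the distance to the next header. The heap size and the names busybit_alloc, busybit_free and busybit_chunk_bytes are assumptions for illustration, and the code assumes that converting a word-aligned pointer to uintptr_t leaves the low bit clear, which holds on mainstream platforms.

```c
#include <stddef.h>
#include <stdint.h>

#define WORDS 256                 /* toy heap size in words (assumption) */
#define BUSY  ((uintptr_t)1)      /* low bit of the header marks "allocated" */

static uintptr_t heap[WORDS];
static int heap_ready = 0;

static uintptr_t *next_of(const uintptr_t *h) { return (uintptr_t *)(*h & ~BUSY); }
static int is_busy(const uintptr_t *h) { return (*h & BUSY) != 0; }

static void heap_init(void) {
    heap[0] = (uintptr_t)&heap[WORDS - 1];         /* one big free chunk */
    heap[WORDS - 1] = (uintptr_t)&heap[0] | BUSY;  /* busy sentinel closes the circle */
    heap_ready = 1;
}

/* Payload bytes of a chunk, derived from the distance to the next header. */
size_t busybit_chunk_bytes(const void *p) {
    const uintptr_t *h = (const uintptr_t *)p - 1;
    return (size_t)(next_of(h) - h - 1) * sizeof(uintptr_t);
}

void *busybit_alloc(size_t nbytes) {
    if (!heap_ready)
        heap_init();
    if (nbytes == 0 || nbytes > (WORDS - 2) * sizeof(uintptr_t))
        return NULL;
    size_t need = (nbytes + sizeof(uintptr_t) - 1) / sizeof(uintptr_t);
    for (uintptr_t *h = heap; h != &heap[WORDS - 1]; h = next_of(h)) {
        if (is_busy(h))
            continue;
        while (!is_busy(next_of(h)))       /* merge adjacent free chunks */
            *h = *next_of(h);
        size_t avail = (size_t)(next_of(h) - h - 1);
        if (avail < need)
            continue;
        if (avail > need) {                /* split off the remainder as a free chunk */
            uintptr_t *rest = h + 1 + need;
            *rest = (uintptr_t)next_of(h);
            *h = (uintptr_t)rest;
        }
        *h |= BUSY;
        return h + 1;                      /* payload starts after the header word */
    }
    return NULL;
}

/* Freeing only clears the busy bit; no size is passed or stored. */
void busybit_free(void *p) {
    if (p == NULL)
        return;
    uintptr_t *h = (uintptr_t *)p - 1;
    *h &= ~BUSY;
}
```

Merging of free neighbours is deferred to the next allocation scan, which keeps the free operation down to a single bit clear.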


Why does free in C not take the number of bytes to be freed?

Because it doesn't need to. The information is already available in the internal management performed by malloc/free.

Here are two considerations (that may or may not have contributed to this decision):

  • Why would you expect a function to receive a parameter it doesn't need?

    (this would complicate virtually all client code relying on dynamic memory, and add completely unnecessary redundancy to your application). Keeping track of pointer allocation is already a difficult problem. Keeping track of memory allocations along with associated sizes would increase the complexity of client code unnecessarily.

  • What would the altered free function do, in these cases?

    void * p = malloc(20);
    free(p, 25); // (1) wrong size provided by client code
    free(NULL, 10); // (2) generic argument mismatch
    

    Would it not free (cause a memory leak?)? Ignore the second parameter? Stop the application by calling exit? Implementing this would add extra failure points in your application, for a feature you probably don't need (and if you need it, see my last point, below - "implementing solution at application level").

"Rather, I want to know why free was made this way in the first place."

Because this is the "proper" way to do it. An API should require the arguments it needs to perform its operation, and no more than that.

"It also occurs to me that explicitly giving the number of bytes to free might allow for some performance optimisations, e.g. an allocator that has separate pools for different allocation sizes would be able to determine which pool to free from just by looking at the input arguments, and there would be less space overhead overall."

The proper ways to implement that, are:

  • (at the system level) within the implementation of malloc - there is nothing stopping the library implementer from writing malloc to use various strategies internally, based on received size.

  • (at application level) by wrapping malloc and free within your own APIs, and using those instead (everywhere in your application that you may need).
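As an illustration of the application-level route, here is a minimal sketch of a wrapper API with separate pools per size class, in which the caller passes the size back when freeing and the size alone selects the pool. Everything here (pool_alloc, pool_free, the three size classes, the slot counts) is hypothetical, and passing a size that differs from the one used at allocation is simply treated as a caller error.

```c
#include <stddef.h>
#include <string.h>

#define NUM_CLASSES 3
#define SLOTS_PER_CLASS 8
#define MAX_SLOT 64

static const size_t class_size[NUM_CLASSES] = { 16, 32, MAX_SLOT };

/* One fixed region per size class; freed slots are chained through their own bytes. */
static _Alignas(max_align_t) unsigned char storage[NUM_CLASSES][SLOTS_PER_CLASS * MAX_SLOT];
static size_t carved[NUM_CLASSES];
static void *free_head[NUM_CLASSES];

/* Smallest class that fits the request, or -1 if none does. */
static int class_for(size_t size) {
    for (int c = 0; c < NUM_CLASSES; c++)
        if (size <= class_size[c])
            return c;
    return -1;
}

void *pool_alloc(size_t size) {
    int c = class_for(size);
    if (c < 0)
        return NULL;
    if (free_head[c] != NULL) {
        void *p = free_head[c];
        memcpy(&free_head[c], p, sizeof(void *));   /* pop: next link lives in the slot */
        return p;
    }
    if (carved[c] == SLOTS_PER_CLASS)
        return NULL;                                /* this pool is exhausted */
    return storage[c] + carved[c]++ * class_size[c];
}

/* The size argument picks the pool; no per-block header is stored anywhere. */
void pool_free(void *p, size_t size) {
    int c = class_for(size);
    if (p == NULL || c < 0)
        return;
    memcpy(p, &free_head[c], sizeof(void *));       /* push onto that pool's free list */
    free_head[c] = p;
}
```

The saving is the missing header; the cost is that every caller must remember the size, and a mismatched size silently returns the slot to the wrong pool.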


Five reasons spring to mind:

  1. It's convenient. It removes a whole load of overhead from the programmer and avoids a class of extremely difficult to track errors.

  2. Passing a size would open up the possibility of releasing part of a block, but since memory managers usually want to keep tracking information, it isn't clear what this would mean.

  3. Lightness Races In Orbit is spot on about padding and alignment. The nature of memory management means that the actual size allocated is quite possibly different from the size you asked for. This means that, were free to require a size as well as a location, malloc would have to be changed to return the actual size allocated as well.

  4. It's not clear that there is any actual benefit to passing in the size, anyway. A typical memory manager has 4-16 bytes of header for each chunk of memory, which includes the size. This chunk header can be common to allocated and unallocated memory, and when adjacent chunks come free they can be collapsed together. If you make the caller store the size, you can probably free up 4 bytes per chunk by not having a separate size field in allocated memory, but that space is probably not gained anyway, since the caller needs to store the size somewhere. And now that information is scattered in memory rather than being predictably located in the chunk header, which is likely to be less operationally efficient anyway.

  5. Even if it were more efficient, it's radically unlikely that your program is spending a large amount of time freeing memory anyway, so the benefit would be tiny.

Incidentally, your idea about separate allocators for different size items is easily implemented without this information (you can use the address to determine where the allocation occurred). This is routinely done in C++.
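A minimal sketch of using the address to determine where the allocation occurred: each size class owns a fixed region of one contiguous arena, so integer arithmetic on the pointer recovers the class and the caller never passes a size. The layout, the region size, and the names region_alloc, region_free, class_of_pointer and region_block_size are assumptions made for this example only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_CLASSES 3
#define REGION_BYTES 512   /* bytes reserved per size class (assumption) */

static const size_t region_class_size[NUM_CLASSES] = { 16, 32, 64 };
static _Alignas(max_align_t) unsigned char regions[NUM_CLASSES * REGION_BYTES];
static size_t carved[NUM_CLASSES];
static void *free_head[NUM_CLASSES];

/* Recover the size class purely from where the pointer lies; -1 if foreign. */
int class_of_pointer(const void *p) {
    uintptr_t base = (uintptr_t)regions;
    uintptr_t addr = (uintptr_t)p;
    if (addr < base || addr >= base + sizeof regions)
        return -1;
    return (int)((addr - base) / REGION_BYTES);
}

/* Block size implied by the address, or 0 for a pointer we do not own. */
size_t region_block_size(const void *p) {
    int c = class_of_pointer(p);
    return c < 0 ? 0 : region_class_size[c];
}

void *region_alloc(size_t size) {
    for (int c = 0; c < NUM_CLASSES; c++) {
        if (size > region_class_size[c])
            continue;                                   /* too big for this class */
        if (free_head[c] != NULL) {
            void *p = free_head[c];
            memcpy(&free_head[c], p, sizeof(void *));   /* pop a recycled slot */
            return p;
        }
        if ((carved[c] + 1) * region_class_size[c] > REGION_BYTES)
            return NULL;                                /* region exhausted */
        return regions + c * REGION_BYTES + carved[c]++ * region_class_size[c];
    }
    return NULL;                                        /* larger than any class */
}

/* One-argument free: the class comes from the address, not from the caller. */
void region_free(void *p) {
    int c = class_of_pointer(p);
    if (c < 0)
        return;
    memcpy(p, &free_head[c], sizeof(void *));
    free_head[c] = p;
}
```

With power-of-two region sizes the division becomes a shift, which is the kind of pointer arithmetic real size-segregated allocators tend to rely on.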

Added later

Another answer, rather ridiculously, has brought up std::allocator as proof that free could work this way but, in fact, it serves as a good example of why free doesn't work this way. There are two key differences between what malloc/free do and what std::allocator does. Firstly, malloc and free are user facing - they're designed for the general programmers to work with - whereas std::allocator is designed to provide specialist memory allocation to the standard library. This provides a nice example of when the first of my points doesn't, or wouldn't, matter. Since it's a library, the difficulties of handling the complexities of tracking size are hidden from the user anyway.

Secondly, std::allocator always works with items of the same size, which means it can use the originally passed number of elements to determine how much to free. Why this differs from free itself is illustrative. In std::allocator the items to be allocated are always of the same, known, size and always the same kind of item, so they always have the same kind of alignment requirements. This means that the allocator could be specialised to simply allocate an array of these items at the start and dole them out as needed. You couldn't do this with free because there is no way to guarantee that the best size to return is the size asked for; instead, it is much more efficient to sometimes return larger blocks than the caller asks for* and thus either the user or the manager needs to track the exact size actually granted. Passing these kinds of implementation details on to the user is a needless headache that gives no benefit to the caller.

* If anyone is still having difficulty understanding this point, consider this: a typical memory allocator adds a small amount of tracking information to the start of a memory block and then returns a pointer offset from this. Information stored here typically includes pointers to the next free block, for example. Let's suppose that the header is a mere 4 bytes long (which is actually smaller than in most real libraries) and doesn't include the size. Then imagine we have a 20-byte free block when the user asks for a 16-byte block: a naive system would return the 16-byte block but then leave a 4-byte fragment that could never, ever be used, wasting time every time malloc gets called. If instead the manager simply returns the 20-byte block, then it saves these messy fragments from building up and is able to allocate the available memory more cleanly. But if the system is to do this correctly without tracking the size itself, we then require the user to track, for every single allocation, the amount of memory actually allocated if it is to pass it back to free. The same argument applies to padding for types/allocations that don't match the desired boundaries. Thus, at most, requiring free to take a size is either (a) completely useless, since the memory allocator can't rely on the passed size to match the actually allocated size, or (b) pointlessly requires the user to do work tracking the real size that would be easily handled by any sensible memory manager.
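The 20-byte versus 16-byte example can be written down as a tiny decision function. This is only a sketch: the threshold MIN_USEFUL_FRAGMENT and the names granted_size and round_up are assumptions chosen for illustration, not taken from any real allocator.

```c
#include <stddef.h>

/* Assumption: leftovers smaller than this are not worth tracking as free blocks. */
#define MIN_USEFUL_FRAGMENT 8

/* Bytes actually handed out when a request of `requested` bytes is served
 * from a free block of `block_size` bytes; 0 if the block is too small. */
size_t granted_size(size_t block_size, size_t requested) {
    if (requested > block_size)
        return 0;
    size_t leftover = block_size - requested;
    /* Give away the whole block rather than strand an unusable fragment. */
    return leftover < MIN_USEFUL_FRAGMENT ? block_size : requested;
}

/* Padding has the same effect: requests are rounded up to a boundary. */
size_t round_up(size_t n, size_t align) {
    return (n + align - 1) / align * align;
}
```

In both cases the size the allocator works with differs from the size the caller asked for, so a caller-supplied size would either be the wrong number or would force the caller to learn the granted size from the allocator first.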


I'm posting this as an answer not because it's the one you're hoping for, but because I believe it's the only plausibly correct one:

It was probably deemed convenient originally, and it could not be improved thereafter.
There is likely no convincing reason for it. (But I'll happily delete this if shown it's incorrect.)

There would be benefits if it was possible: you could allocate a single large piece of memory whose size you knew beforehand, then free a little bit at a time -- as opposed to repeatedly allocating and freeing small chunks of memory. Currently tasks like this are not possible.


To the many (many[1]!) of you who think passing the size is so ridiculous:

May I refer you to C++'s design decision for the std::allocator<T>::deallocate method?

void deallocate(pointer p, size_type n);

All n T objects in the area pointed to by p shall be destroyed prior to this call.
n shall match the value passed to allocate to obtain this memory.

I think you'll have a rather "interesting" time analyzing this design decision.


As for operator delete, it turns out that the 2013 N3778 proposal ("C++ Sized Deallocation") is intended to fix that, too.


[1] Just look at the comments under the original question to see how many people made hasty assertions such as "the asked for size is completely useless for the free call" to justify the lack of the size parameter.


malloc and free go hand in hand, with each "malloc" being matched by one "free". Thus it makes total sense that the "free" matching a previous "malloc" should simply free up the amount of memory allocated by that malloc - this is the majority use case that would make sense in 99% of cases. Imagine all the memory errors if all uses of malloc/free by all programmers around the world ever, would need the programmer to keep track of the amount allocated in malloc, and then remember to free the same. The scenario you talk about should really be using multiple mallocs/frees in some kind of memory management implementation.


I would suggest that it is because it is very convenient not to have to manually track size information in this way (in some cases) and also less prone to programmer error.

Additionally, realloc would need this bookkeeping information, which I expect contains more than just the allocation size. i.e. it allows the mechanism by which it works to be implementation defined.

You could write your own allocator that works somewhat in the way you suggest, though. This is often done in C++ for pool allocators in a similar way for specific cases (with potentially massive performance gains), although it is generally implemented in terms of operator new for allocating pool blocks.


I don't see how an allocator would work that does not track the size of its allocations. If it didn't do this, how would it know which memory is available to satisfy a future malloc request? It has to at least store some sort of data structure containing addresses and lengths, to indicate where the available memory blocks are. (And of course, storing a list of free spaces is equivalent to storing a list of allocated spaces).


Well, the only thing you need is a pointer that you'll use to free up the memory you previously allocated. The number of bytes is managed for you by the allocator, so you don't have to worry about it. It wouldn't be necessary to get the number of bytes allocated returned by free(). Here is a manual way to see how many bytes a running program has allocated:

If you work on Linux and want to know how many bytes malloc has allocated, you can write a simple program that calls malloc once or n times and prints the pointers it gets. The program must also sleep for a few seconds (long enough for you to do the following). Run the program, find its PID, run cd /proc/process_PID, and then type "cat maps". One specific line of the output shows the start and end addresses of the heap memory region (the one in which you are allocating memory dynamically). If you print the pointers to the regions being allocated, you can estimate how much memory you have allocated.

Hope it helps!


Why should it? malloc() and free() are intentionally very simple memory management primitives, and higher-level memory management in C is largely up to the developer.

Moreover, realloc() does that already: if you reduce the allocation with realloc(), typical implementations will not move the data, and the pointer returned will usually be the same as the original (although the C standard does not guarantee this).
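A minimal sketch of shrinking an allocation with realloc. The helper name shrink_buffer is hypothetical; the only library call used is the standard realloc. Because the standard allows realloc to return a different address even when shrinking, the code always continues with the returned pointer rather than assuming the block stayed in place.

```c
#include <stdlib.h>

/* Shrinks *buf to new_size bytes.
 * Returns 0 on success; returns -1 on failure, leaving *buf valid and unchanged. */
int shrink_buffer(char **buf, size_t new_size) {
    if (new_size == 0)
        return -1;               /* realloc(p, 0) is best avoided; treat as an error here */
    char *smaller = realloc(*buf, new_size);
    if (smaller == NULL)
        return -1;               /* original block is still allocated */
    *buf = smaller;              /* may or may not be the same address as before */
    return 0;
}
```

The contents up to new_size bytes are preserved either way, and the tail of the block is given back to the allocator without the caller ever stating the old size.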

It is generally true of the entire standard library that it is composed of simple primitives from which you can build more complex functions to suit your application needs. So the answer to any question of the form "why does the standard library not do X" is because it cannot do everything a programmer might think of (that's what programmers are for), so it chooses to do very little - build your own or use third-party libraries. If you want a more extensive standard library - including more flexible memory management, then C++ may be the answer.

You tagged the question C++ as well as C, and if C++ is what you are using, then you should hardly be using malloc/free in any case - apart from new/delete, STL container classes manage memory automatically, and in a manner likely to be specifically appropriate to the nature of the various containers.

Reference URL: https://stackoverflow.com/questions/24203940/why-does-free-in-c-not-take-the-number-of-bytes-to-be-freed
