sk_buff_head missing in newer linux kernels - linux-kernel

In older versions of linux kernel (e.g. 2.6.11) struct sk_buff contains a pointer to struct sk_buff_head (named list). The 'Understanding Linux Network Internals' book says this pointer is maintained as sk_buffs need to quickly look up the head of the skb list. However I could not find such a member in recent versions of kernel (3.2.1). Can anyone explain how the skb list management has changed in newer kernels?

This changed a long time ago, apparently in 2.6.14. The kernel commit in question was:
commit 8728b834b226ffcf2c94a58530090e292af2a7bf
Author: David S. Miller <davem#davemloft.net>
Date: Tue Aug 9 19:25:21 2005
[NET]: Kill skb->list
Remove the "list" member of struct sk_buff, as it is entirely
redundant. All SKB list removal callers know which list the
SKB is on, so storing this in sk_buff does nothing other than
taking up some space.
Two tricky bits were SCTP, which I took care of, and two ATM
drivers which Francois Romieu <romieu#fr.zoreil.com> fixed
up.
Signed-off-by: David S. Miller <davem#davemloft.net>
Signed-off-by: Francois Romieu <romieu#fr.zoreil.com>
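
To illustrate the point of the commit message, here is a minimal userspace sketch (not kernel code) of a queue whose elements carry no back-pointer to their list head. The names buf, buf_head, buf_queue_tail and buf_unlink are hypothetical stand-ins. In the kernel, the analogous helper skb_unlink() takes the struct sk_buff_head as an explicit second argument.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff: no pointer back to the list head. */
struct buf {
    struct buf *next;
    struct buf *prev;
    int id;
};

/* Hypothetical stand-in for struct sk_buff_head: a sentinel node plus a length. */
struct buf_head {
    struct buf sentinel;
    unsigned int qlen;
};

void buf_head_init(struct buf_head *list)
{
    list->sentinel.next = &list->sentinel;
    list->sentinel.prev = &list->sentinel;
    list->qlen = 0;
}

void buf_queue_tail(struct buf_head *list, struct buf *b)
{
    b->next = &list->sentinel;
    b->prev = list->sentinel.prev;
    list->sentinel.prev->next = b;
    list->sentinel.prev = b;
    list->qlen++;
}

/* The caller names the list, so the element does not need to remember it. */
void buf_unlink(struct buf *b, struct buf_head *list)
{
    b->prev->next = b->next;
    b->next->prev = b->prev;
    b->next = NULL;
    b->prev = NULL;
    list->qlen--;
}
```

Because every caller of buf_unlink() already holds the head it is working on, storing that head in each element would only cost memory, which is the argument the commit makes for removing skb->list.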


Changes in new linux kernel 4.4.2 distributed with SLES 12 SP2 causing driver build failure

I am building my SLES 12 block device driver, which was written for a 3.x kernel, on SLES 12 SP2, which ships kernel version 4.4.2. I am running into a few problems:
struct bvec_merge_data
is not available in kernel 4.3.0 onwards in include/linux/blkdev.h
struct bvec_merge_data {
struct block_device *bi_bdev;
sector_t bi_sector;
unsigned bi_size;
unsigned long bi_rw;
};
From 4.2.8 onward this function pointer type is no longer present. What alternative is provided in 4.3 or higher versions?
typedef int (merge_bvec_fn) (struct request_queue *, struct
bvec_merge_data *,
struct bio_vec *);
The members below were also removed from the request_queue structure after 4.2.8. Where are they handled now?
struct request_queue {
unprep_rq_fn *unprep_rq_fn;
merge_bvec_fn *merge_bvec_fn;
Any idea where I can look for these changes, and is there any alternative for them?
The best place for such answers is the git log of the kernel source. The -S switch searches within the diff content for a string; -G does the same with regular expressions.
In this case, running git log -S "bvec_merge_data" shows the changes related to this struct and, by association, to merge_bvec_fn. Here's a snapshot of the top message, which describes the complete removal of struct bvec_merge_data:
commit 8ae126660fddbeebb9251a174e6fa45b6ad8f932
Author: Kent Overstreet
Date: Mon Apr 27 23:48:34 2015 -0700
block: kill merge_bvec_fn() completely
As generic_make_request() is now able to handle arbitrarily sized bios,
it's no longer necessary for each individual block driver to define its
own ->merge_bvec_fn() callback. Remove every invocation completely.
The commit messages preceding this one show the build-up to it, and together they give a good step-by-step answer to your question.
Hope it helps :)
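
If the driver has to build against both older and newer kernels, one common approach is to guard the old callback registration with a version check. The sketch below is only an illustration under assumptions: kernel_has_merge_bvec_fn and my_merge_bvec_fn are hypothetical names, the KERNEL_VERSION macro is redefined here so the snippet compiles in userspace, and distribution kernels such as SLES may backport changes, so a plain version comparison may not match what the installed headers actually contain.

```c
/* Same encoding as the kernel's <linux/version.h>; defined here so the
 * sketch also builds outside the kernel tree. */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#endif

/* Hypothetical helper: nonzero if this version code predates the
 * removal of merge_bvec_fn in 4.3. */
int kernel_has_merge_bvec_fn(unsigned int version_code)
{
    return version_code < KERNEL_VERSION(4, 3, 0);
}

/*
 * In a real driver the same test would be done at compile time:
 *
 *   #if LINUX_VERSION_CODE < KERNEL_VERSION(4, 3, 0)
 *       blk_queue_merge_bvec(q, my_merge_bvec_fn);  // hypothetical callback
 *   #endif
 *
 * On 4.3 and later nothing replaces the call: per the commit message,
 * the block layer handles arbitrarily sized bios itself.
 */
```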

what is the address family unknown1 in winsock2.h?

In the header file winsock2.h, I found an address family called unknown1.
What does this address family represent, and what is it used for?
Here is the source code of the header file winsock2.h, and here is the code line that contain the constant of that address family:
#define AF_UNKNOWN1 20
Your copy of winsock2.h is strange, perhaps you left off the comment on purpose. I keep old versions of SDKs around, they are an interesting archeological record of Windows development. I can trace this one back to the WinNT version 4 SDK, released in 1996 and the first SDK version that supported Winsock v2. It extended the address families first supported in NT 3.1 and Winsock v1.1, copy-pasting all of the added ones:
#define AF_VOICEVIEW 18 /* VoiceView */
#define AF_FIREFOX 19 /* Protocols from Firefox */
#define AF_UNKNOWN1 20 /* Somebody is using this! */
#define AF_BAN 21 /* Banyan */
#define AF_ATM 22 /* Native ATM Services */
#define AF_INET6 23 /* Internetwork Version 6 */
Still looks the same way today. Obviously the comment is relevant, Somebody is using this! should have the emphasis on Somebody. It is bracketed by products of companies that had pretty successful products back in the middle 90s, big enough to have a working relationship with Microsoft and get their product verified and supported by Winsock 2 and WinNT4 (Firefox was a company, not the browser btw).
So a somewhat plausible scenario is that a conflict was detected by a tester, without having any idea how dirty his machine was, who then filed a bug report. If Microsoft didn't know back in 1996 then, well, nobody knows. Time has not been kind to these companies and their products; the dominance of TCP/IP and the dot-com bubble bust killed just about all of them. Surely the same happened to Somebody Inc :)
The name is actually fairly self-describing: it covers everything that is not defined otherwise. That is, AF_UNKNOWN1 is an address family that is none of the other defined address families, and PF_UNKNOWN1 is the corresponding protocol family. I could not quickly find any pointers on the 1 suffix; my assumption is that it was introduced to avoid conflicts with _UNKNOWN definitions that might already exist.

What causes this error: "address already known to kernel for another [busy] synchronizer type"?

I have a customer who is getting their system log flooded with thousands of copies of this message:
Jul 25 11:21:33 athayer-mbp13 kernel[0]: PSYNCH: pid[52893]: address already known to kernel for another [busy] synchronizer type
The culprit is my app, but I can’t reproduce the problem and don’t have much of a clue to its cause. My app does disk searching, and this error happens about 15 hours into the life of the process. There is no excessive memory usage or file descriptor leakage. The app continues to operate normally, it’s just that these messages cause the system log to blow up to gigabyte proportions and fill up the boot disk.
I found the Darwin kernel code where the message is printed, but it’s only a clue, it doesn’t show the smoking gun:
http://opensource.apple.com//source/xnu/xnu-1699.32.7/bsd/kern/pthread_support.c
FAILEDUSERTEST("address already known to kernel for another (busy) synchronizer type\n");
It’s in this function:
/* find kernel waitqueue, if not present create one. Grants a reference */
int
ksyn_wqfind(user_addr_t mutex, uint32_t mgen, uint32_t ugen, uint32_t rw_wc, uint64_t tid, int flags, int wqtype, ksyn_wait_queue_t * kwqp)
Can anyone provide any insight into what’s going on?
Here’s the profile for the machine:
Model Name: MacBook Pro
Model Identifier: MacBookPro12,1
Processor Name: Intel Core i5
Processor Speed: 2.7 GHz
Number of Processors: 1
Total Number of Cores: 2
L2 Cache (per Core): 256 KB
L3 Cache: 3 MB
Memory: 8 GB
Boot ROM Version: MBP121.0167.B16
SMC Version (system): 2.28f7
Hardware UUID: 9205D058-90BF-541E-8E61-E75259ABC11F
System Software Overview:
System Version: OS X 10.11.4 (15E65)
Kernel Version: Darwin 15.4.0
Boot Volume: Macintosh HD
Boot Mode: Normal
Computer Name: athayer-mbp13
User Name: System Administrator (root)
Secure Virtual Memory: Enabled
system_integrity: integrity_enabled
Time since boot: 9 days 18:55
Possible Explanation
It's possible that you're being affected by an old kernel bug. If a pthread condition variable (the main component of a standard pthread_mutex family object) is allocated, but never waited on, there is a situation in which its object is never removed from a pthreads-internal registry on OSX.
If that happens, and if another mutex is later allocated that happens to end up in the same space in memory, and if that mutex is waited on, this error can occur, since the new mutex's ID will not match the one already present in its space. This is distinct from a reallocation issue where garbled/meaningless info is found instead of a valid ID.
Workaround
The workaround is to ensure that you call a wait function on every mutex/condvar you create. Even a one-nanosecond wait on a no-longer-used object will trigger "correct" destruction once it completes. An example of this fix by the Chromium devs is linked below.
For example, you could wait one nanosecond/tick on a lock thus:
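/* The mutex must be held by the caller while waiting. */
struct timespec time = { .tv_sec = 0, .tv_nsec = 1 };
pthread_mutex_lock(&some_lock_handle);
pthread_cond_timedwait_relative_np(
&some_condition_handle,
&some_lock_handle,
&time /* the timeout is passed by pointer */
);
pthread_mutex_unlock(&some_lock_handle);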
Confounding Factors
The kernel bug may not be the real issue. There are a lot of confounding factors here:
The kernel source hasn't been published for 10.10 or 10.11, so the code being called that generates that error may not be the code that you found online.
As a result of that, the kernel bug I mentioned may not still exist, or may not be reachable in the same way.
The error line you published has square brackets ([]) around the word "busy", but the source you found has parentheses (()). The places in code that print out the two different messages are distinct from each other, so the problem lines might not be the ones you pointed out in your question.
Relevant Links
Article by the first (only?) person who has diagnosed this issue: http://rayne3d.com/blog/02-27-2014-rayne-weekly-devblog-4
The problem gets exhibited in the pthread source (or it was, in pthread 105.1.4), visible at this link (search in the page for 13782056): https://opensource.apple.com/source/libpthread/libpthread-105.1.4/src/pthread_cond.c
An example fix like the workaround listed above was made by the Chromium team when they were affected by a similar (the same?) issue: https://codereview.chromium.org/1323293005
The original Apple Developer Forum link appears to be defunct, though I might just be unable to access it: https://devforums.apple.com/thread/220316?tstart=0

Explanation of struct ieee80211_local in Linux kernel

Can anybody explain the ieee80211_local structure and its members to me?
The structure is defined in net/mac80211/ieee80211_i.h of the Linux source code, somewhere near line 930; the exact line varies between kernel versions.
According to a presentation by Daniel Camps Mur struct ieee80211_local "…contains information about the real hardware, and is created when the interface is first added."
In another set of slides by Johannes Berg it is stated that the struct "represents a wireless device." On the same slide, you can find a statement about the element ieee80211_hw "…is the part of ieee80211_local that is visible to drivers."
Interestingly, the struct is not mentioned in the 802.11 subsystems documentation by Johannes Berg.
Looking at a source code cross reference of the Linux kernel, you can see that the ieee80211_local struct is never used outside of the mac80211 part. Therefore, I think that it is indeed an internally used representation of a wireless device from mac80211's point of view.
In contrast, you can see that the ieee80211_hw element is used in both mac80211 and various wireless device drivers which underlines that it is used to communicate between mac80211 and drivers.
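
The relationship between the two structs can be sketched in userspace as follows. This is a simplified mock, not the kernel definitions: the mock_ names and the reduced member lists are assumptions for illustration. As far as I can tell from the mac80211 source, the real code embeds struct ieee80211_hw inside struct ieee80211_local and recovers the outer struct with a container_of()-based helper named hw_to_local().

```c
#include <stddef.h>

/* Simplified mock of the driver-visible part; the real struct has many
 * more members. */
struct mock_ieee80211_hw {
    int queues;
};

/* Simplified mock of mac80211's private state, embedding the
 * driver-visible part. */
struct mock_ieee80211_local {
    int open_count;               /* private to mac80211 in this mock */
    struct mock_ieee80211_hw hw;  /* the only part drivers get to see */
};

/* Userspace version of the kernel's container_of() idiom. */
#define mock_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Given the pointer a driver holds, recover the enclosing private struct. */
struct mock_ieee80211_local *mock_hw_to_local(struct mock_ieee80211_hw *hw)
{
    return mock_container_of(hw, struct mock_ieee80211_local, hw);
}
```

Drivers only ever hold the inner pointer, so the private fields stay out of their reach while mac80211 can still get back to its own state from any callback argument.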
BTW, the struct was introduced with the very first commit of ieee80211_i.h by Jiri Benc in 2007.
If you need more details about the struct and its members, it sounds like you want to do development on the mac80211 code. I would suggest getting in touch with the active developers. The Linux wireless mailing list may be a good starting point for that.

feature request: an atomicAdd() function included in gwan.h

In the G-WAN KV options, KV_INCR_KEY will use the 1st field as the primary key.
That means there is a function which increments atomically already built in the G-WAN core to make this primary index work.
It would be good to expose this function to servlets, i.e. to include it in gwan.h.
By doing so, ANSI C newbies like me could benefit from it.
There was ample discussion about this on the old G-WAN forum, and people were invited to share their experiences with atomic operations in order to build a rich list of documented functions, platform by platform.
Atomic operations are not portable because they address the CPU directly. It means that the code for Intel x86 (32-bit) and Intel AMD64 (64-bit) is different. Each platform (ARM, Power7, Cell, Motorola, etc.) has its own atomic instruction sets.
Such a list was not published in the gwan.h file so far because basic operations are easy to find (the GCC compiler offers several atomic intrinsics as C extensions) but more sophisticated operations are less obvious (needs asm skills) and people will build them as they need - for very specific uses in their code.
Software Engineering is always a balance between what can be made available at the lowest possible cost to entry (like the G-WAN KV store, which uses a small number of functions) and how it actually works (which is far less simple to follow).
So, beyond the obvious (incr/decr, set/get), to learn more about atomic operations, use Google, find CPU instruction sets manuals, and arm yourself with courage!
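
As a minimal sketch of the "obvious" case, an atomic add can be built on the GCC intrinsic __sync_add_and_fetch. The names atomic_add_int, incr_job, incr_worker and run_incr_workers are hypothetical and not part of gwan.h; the thread harness is only there to show that concurrent increments are not lost.

```c
#include <pthread.h>

/* Hypothetical wrapper around the GCC builtin; returns the new value. */
int atomic_add_int(volatile int *target, int delta)
{
    return __sync_add_and_fetch(target, delta);
}

struct incr_job {
    volatile int *counter;
    int iterations;
};

/* Each worker adds 1 to the shared counter 'iterations' times. */
static void *incr_worker(void *arg)
{
    struct incr_job *job = arg;
    for (int i = 0; i < job->iterations; i++)
        atomic_add_int(job->counter, 1);
    return NULL;
}

/* Runs nthreads workers (capped at 16) and returns the final count. */
int run_incr_workers(int nthreads, int iterations)
{
    volatile int counter = 0;
    pthread_t threads[16];
    struct incr_job job = { &counter, iterations };

    if (nthreads > 16)
        nthreads = 16;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&threads[i], NULL, incr_worker, &job);
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);
    return counter;
}
```

With a plain counter++ in place of the builtin, the final count would typically fall short under contention, which is the problem the atomic instruction solves.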
Thanks for Gil's helpful guidance.
Now, I can do it by myself.
I changed the code in persistence.c as shown below.
First, I changed the definition of val in data to volatile.
//data[0]->val++;
//xbuf_xcat(reply, "Value: %d", data[0]->val);
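int new_count = 0, loops = 50000000, time1, time2, time;
time1 = getus();
for (int i = 0; i < loops; i++) { /* i must be initialized before use */
new_count = __sync_add_and_fetch(&data[0]->val, 1);
}
time2 = getus();
/* getus() counts microseconds; multiply first so the integer division does not truncate */
time = (int)((loops * 1000LL) / (time2 - time1));
xbuf_xcat(reply, "Value: %d, time: %d incr_ops/msec", new_count, time);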
I got 52,000 incr_operations/msec with my old E2180 CPU.
So, with GCC compiler I can do it by myself.
thanks again.
