1. 25 Oct, 2018 2 commits
    • Wei Wang's avatar
      virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT · 86a55978
      Wei Wang authored
      Negotiation of the VIRTIO_BALLOON_F_FREE_PAGE_HINT feature indicates the
      support of reporting hints of guest free pages to host via virtio-balloon.
      Currenlty, only free page blocks of MAX_ORDER - 1 are reported. They are
      obtained one by one from the mm free list via the regular allocation
      function.
      
      Host requests the guest to report free page hints by sending a new cmd id
      to the guest via the free_page_report_cmd_id configuration register. When
      the guest starts to report, it first sends a start cmd to host via the
      free page vq, which acks to host the cmd id received. When the guest
      finishes reporting free pages, a stop cmd is sent to host via the vq.
      Host may also send a stop cmd id to the guest to stop the reporting.
      
      VIRTIO_BALLOON_CMD_ID_STOP: Host sends this cmd to stop the guest
      reporting.
      VIRTIO_BALLOON_CMD_ID_DONE: Host sends this cmd to tell the guest that
      the reported pages are ready to be freed.
      
      Why does the guest free the reported pages when host tells it is ready to
      free?
      This is because freeing pages appears to be expensive for live migration.
      free_pages() dirties memory very quickly and makes the live migraion not
      converge in some cases. So it is good to delay the free_page operation
      when the migration is done, and host sends a command to guest about that.
      
      Why do we need the new VIRTIO_BALLOON_CMD_ID_DONE, instead of reusing
      VIRTIO_BALLOON_CMD_ID_STOP?
      This is because live migration is usually done in several rounds. At the
      end of each round, host needs to send a VIRTIO_BALLOON_CMD_ID_STOP cmd to
      the guest to stop (or say pause) the reporting. The guest resumes the
      reporting when it receives a new command id at the beginning of the next
      round. So we need a new cmd id to distinguish between "stop reporting" and
      "ready to free the reported pages".
      
      TODO:
      - Add a batch page allocation API to amortize the allocation overhead.
      Signed-off-by: default avatarWei Wang <wei.w.wang@intel.com>
      Signed-off-by: default avatarLiang Li <liang.z.li@intel.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      86a55978
    • Lénaïc Huard's avatar
      kvm_config: add CONFIG_VIRTIO_MENU · d7b31359
      Lénaïc Huard authored
      Make sure that make kvmconfig enables all the virtio drivers even if it is
      preceded by a make allnoconfig.
      Signed-off-by: default avatarLénaïc Huard <lenaic@lhuard.fr>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      d7b31359
  2. 22 Oct, 2018 8 commits
  3. 21 Oct, 2018 3 commits
  4. 20 Oct, 2018 11 commits
  5. 19 Oct, 2018 13 commits
  6. 18 Oct, 2018 3 commits
    • Stefano Brivio's avatar
      ip6_tunnel: Fix encapsulation layout · d4d576f5
      Stefano Brivio authored
      Commit 058214a4 ("ip6_tun: Add infrastructure for doing
      encapsulation") added the ip6_tnl_encap() call in ip6_tnl_xmit(), before
      the call to ipv6_push_frag_opts() to append the IPv6 Tunnel Encapsulation
      Limit option (option 4, RFC 2473, par. 5.1) to the outer IPv6 header.
      
      As long as the option didn't actually end up in generated packets, this
      wasn't an issue. Then commit 89a23c8b ("ip6_tunnel: Fix missing tunnel
      encapsulation limit option") fixed sending of this option, and the
      resulting layout, e.g. for FoU, is:
      
      .-------------------.------------.----------.-------------------.----- - -
      | Outer IPv6 Header | UDP header | Option 4 | Inner IPv6 Header | Payload
      '-------------------'------------'----------'-------------------'----- - -
      
      Needless to say, FoU and GUE (at least) won't work over IPv6. The option
      is appended by default, and I couldn't find a way to disable it with the
      current iproute2.
      
      Turn this into a more reasonable:
      
      .-------------------.----------.------------.-------------------.----- - -
      | Outer IPv6 Header | Option 4 | UDP header | Inner IPv6 Header | Payload
      '-------------------'----------'------------'-------------------'----- - -
      
      With this, and with 84dad559 ("udp6: fix encap return code for
      resubmitting"), FoU and GUE work again over IPv6.
      
      Fixes: 058214a4 ("ip6_tun: Add infrastructure for doing encapsulation")
      Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d4d576f5
    • Jon Maloy's avatar
      tipc: fix info leak from kernel tipc_event · b06f9d9f
      Jon Maloy authored
      We initialize a struct tipc_event allocated on the kernel stack to
      zero to avert info leak to user space.
      
      Reported-by: syzbot+057458894bc8cada4dee@syzkaller.appspotmail.com
      Signed-off-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b06f9d9f
    • Wenwen Wang's avatar
      net: socket: fix a missing-check bug · b6168562
      Wenwen Wang authored
      In ethtool_ioctl(), the ioctl command 'ethcmd' is checked through a switch
      statement to see whether it is necessary to pre-process the ethtool
      structure, because, as mentioned in the comment, the structure
      ethtool_rxnfc is defined with padding. If yes, a user-space buffer 'rxnfc'
      is allocated through compat_alloc_user_space(). One thing to note here is
      that, if 'ethcmd' is ETHTOOL_GRXCLSRLALL, the size of the buffer 'rxnfc' is
      partially determined by 'rule_cnt', which is actually acquired from the
      user-space buffer 'compat_rxnfc', i.e., 'compat_rxnfc->rule_cnt', through
      get_user(). After 'rxnfc' is allocated, the data in the original user-space
      buffer 'compat_rxnfc' is then copied to 'rxnfc' through copy_in_user(),
      including the 'rule_cnt' field. However, after this copy, no check is
      re-enforced on 'rxnfc->rule_cnt'. So it is possible that a malicious user
      race to change the value in the 'compat_rxnfc->rule_cnt' between these two
      copies. Through this way, the attacker can bypass the previous check on
      'rule_cnt' and inject malicious data. This can cause undefined behavior of
      the kernel and introduce potential security risk.
      
      This patch avoids the above issue via copying the value acquired by
      get_user() to 'rxnfc->rule_cn', if 'ethcmd' is ETHTOOL_GRXCLSRLALL.
      Signed-off-by: default avatarWenwen Wang <wang6495@umn.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b6168562