366 Commits

Author SHA1 Message Date
Chen Qun
d59e6f7dbe mirror: Wait only for in-flight operations
mirror_wait_for_free_in_flight_slot() just picks a random operation to
wait for. However, a MirrorOp is already in s->ops_in_flight when
mirror_co_read() waits for free slots, so if not enough slots are
immediately available, an operation can end up waiting for itself, or
two or more operations can wait for each other to complete, which
results in a hang.

Fix this by adding a flag to MirrorOp that tells us if the request is
already in flight (and therefore occupies slots that it will later
free), and picking only such operations for waiting.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1794692
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200326153628.4869-3-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-07-22 11:27:53 +08:00
Chen Qun
e9b39dadb6 qcow2: Fix qcow2_alloc_cluster_abort() for external data file
For external data file, cluster allocations return an offset in the data
file and are not refcounted. In this case, there is nothing to do for
qcow2_alloc_cluster_abort(). Freeing the same offset in the qcow2 file
is wrong and causes crashes in the better case or image corruption in
the worse case.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200211094900.17315-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-07-22 11:27:53 +08:00
openeuler-ci-bot
d7b0ae52b5 !335 Automatically generate code patches with openeuler !164 !165
From: @kuhnchen18
Reviewed-by: @imxcc
Signed-off-by: @imxcc
2021-07-21 17:27:49 +00:00
Chen Qun
3072b3a631 spec: Update release version with !164 !165
increase release verison by one

Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
2021-07-21 21:27:22 +08:00
Chen Qun
e94ff0db36 spec: Update patch and changelog with !165 backport qemu-4.1 bugfix and CVE fix !165
block/curl: HTTP header fields allow whitespace around values
block/curl: HTTP header field names are case insensitive
backup: Improve error for bdrv_getlength() failure
mirror: Make sure that source and target size match
iotests/143: Create socket in $SOCK_DIR
nbd/server: Avoid long error message assertions CVE-2020-10761
block: Call attention to truncation of long NBD exports
qemu-img convert: Don't pre-zero images

Signed-off-by: Chen Qun<kuhn.chenqun@huawei.com>
2021-07-21 21:27:22 +08:00
Chen Qun
ea6cf8b547 qemu-img convert: Don't pre-zero images
RH-Author: Kevin Wolf <kwolf@redhat.com>
Message-id: <20200731081831.13781-2-kwolf@redhat.com>
Patchwork-id: 98117
O-Subject: [RHEL-AV-8.2.1.z qemu-kvm PATCH 1/1] qemu-img convert: Don't pre-zero images
Bugzilla: 1861682
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Max Reitz <mreitz@redhat.com>
RH-Acked-by: Eric Blake <eblake@redhat.com>

Since commit 5a37b60a61c, qemu-img create will pre-zero the target image
if it isn't already zero-initialised (most importantly, for host block
devices, but also iscsi etc.), so that writing explicit zeros wouldn't
be necessary later.

This could speed up the operation significantly, in particular when the
source image file was only sparsely populated. However, it also means
that some block are written twice: Once when pre-zeroing them, and then
when they are overwritten with actual data. On a full image, the
pre-zeroing is wasted work because everything will be overwritten.

In practice, write_zeroes typically turns out faster than writing
explicit zero buffers, but slow enough that first zeroing everything and
then overwriting parts can be a significant net loss.

Meanwhile, qemu-img convert was rewritten in 690c7301600 and zero blocks
are now written to the target using bdrv_co_pwrite_zeroes() if the
target could be pre-zeroed. This way we already make use of the faster
write_zeroes operation, but avoid writing any blocks twice.

Remove the pre-zeroing because these days this former optimisation has
actually turned into a pessimisation in the common case.

Reported-by: Nir Soffer <nsoffer@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200622151203.35624-1-kwolf@redhat.com>
Tested-by:  Nir Soffer <nsoffer@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit edafc70c0c8510862f2f213a3acf7067113bcd08)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
2021-07-21 21:27:22 +08:00
Chen Qun
126dbf2602 block: Call attention to truncation of long NBD exports
RH-Author: Eric Blake <eblake@redhat.com>
Message-id: <20200610183202.3780750-3-eblake@redhat.com>
Patchwork-id: 97495
O-Subject: [RHEL-AV-8.2.1 qemu-kvm PATCH 2/2] block: Call attention to truncation of long NBD exports
Bugzilla: 1845384
RH-Acked-by: Sergio Lopez Pascual <slp@redhat.com>
RH-Acked-by: Max Reitz <mreitz@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>

Commit 93676c88 relaxed our NBD client code to request export names up
to the NBD protocol maximum of 4096 bytes without NUL terminator, even
though the block layer can't store anything longer than 4096 bytes
including NUL terminator for display to the user.  Since this means
there are some export names where we have to truncate things, we can
at least try to make the truncation a bit more obvious for the user.
Note that in spite of the truncated display name, we can still
communicate with an NBD server using such a long export name; this was
deemed nicer than refusing to even connect to such a server (since the
server may not be under our control, and since determining our actual
length limits gets tricky when nbd://host:port/export and
nbd+unix:///export?socket=/path are themselves variable-length
expansions beyond the export name but count towards the block layer
name length).

Reported-by: Xueqiang Wei <xuwei@redhat.com>
Fixes: https://bugzilla.redhat.com/1843684
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200610163741.3745251-3-eblake@redhat.com>
(cherry picked from commit 5c86bdf1208916ece0b87e1151c9b48ee54faa3e)
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eduardo Lima (Etrunko) <etrunko@redhat.com>
2021-07-21 21:27:22 +08:00
Chen Qun
b79f9f3239 nbd/server: Avoid long error message assertions CVE-2020-10761
RH-Author: Eric Blake <eblake@redhat.com>
Message-id: <20200610183202.3780750-2-eblake@redhat.com>
Patchwork-id: 97494
O-Subject: [RHEL-AV-8.2.1 qemu-kvm PATCH 1/2] nbd/server: Avoid long error message assertions CVE-2020-10761
Bugzilla: 1845384
RH-Acked-by: Sergio Lopez Pascual <slp@redhat.com>
RH-Acked-by: Max Reitz <mreitz@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>

Ever since commit 36683283 (v2.8), the server code asserts that error
strings sent to the client are well-formed per the protocol by not
exceeding the maximum string length of 4096.  At the time the server
first started sending error messages, the assertion could not be
triggered, because messages were completely under our control.
However, over the years, we have added latent scenarios where a client
could trigger the server to attempt an error message that would
include the client's information if it passed other checks first:

- requesting NBD_OPT_INFO/GO on an export name that is not present
  (commit 0cfae925 in v2.12 echoes the name)

- requesting NBD_OPT_LIST/SET_META_CONTEXT on an export name that is
  not present (commit e7b1948d in v2.12 echoes the name)

At the time, those were still safe because we flagged names larger
than 256 bytes with a different message; but that changed in commit
93676c88 (v4.2) when we raised the name limit to 4096 to match the NBD
string limit.  (That commit also failed to change the magic number
4096 in nbd_negotiate_send_rep_err to the just-introduced named
constant.)  So with that commit, long client names appended to server
text can now trigger the assertion, and thus be used as a denial of
service attack against a server.  As a mitigating factor, if the
server requires TLS, the client cannot trigger the problematic paths
unless it first supplies TLS credentials, and such trusted clients are
less likely to try to intentionally crash the server.

We may later want to further sanitize the user-supplied strings we
place into our error messages, such as scrubbing out control
characters, but that is less important to the CVE fix, so it can be a
later patch to the new nbd_sanitize_name.

Consideration was given to changing the assertion in
nbd_negotiate_send_rep_verr to instead merely log a server error and
truncate the message, to avoid leaving a latent path that could
trigger a future CVE DoS on any new error message.  However, this
merely complicates the code for something that is already (correctly)
flagging coding errors, and now that we are aware of the long message
pitfall, we are less likely to introduce such errors in the future,
which would make such error handling dead code.

Reported-by: Xueqiang Wei <xuwei@redhat.com>
CC: qemu-stable@nongnu.org
Fixes: https://bugzilla.redhat.com/1843684 CVE-2020-10761
Fixes: 93676c88d7
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200610163741.3745251-2-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
(cherry picked from commit 5c4fe018c025740fef4a0a4421e8162db0c3eefd)
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eduardo Lima (Etrunko) <etrunko@redhat.com>
2021-07-21 21:27:22 +08:00
Chen Qun
ce754b00bd iotests/143: Create socket in $SOCK_DIR
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20191017133155.5327-9-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2021-07-21 21:27:22 +08:00
Chen Qun
4d0a386d5f mirror: Make sure that source and target size match
RH-Author: Kevin Wolf <kwolf@redhat.com>
Message-id: <20200603160325.67506-11-kwolf@redhat.com>
Patchwork-id: 97110
O-Subject: [RHEL-AV-8.2.1 qemu-kvm PATCH v2 10/11] mirror: Make sure that source and target size match
Bugzilla: 1778593
RH-Acked-by: Eric Blake <eblake@redhat.com>
RH-Acked-by: Max Reitz <mreitz@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>

If the target is shorter than the source, mirror would copy data until
it reaches the end of the target and then fail with an I/O error when
trying to write past the end.

If the target is longer than the source, the mirror job would complete
successfully, but the target wouldn't actually be an accurate copy of
the source image (it would contain some additional garbage at the end).

Fix this by checking that both images have the same size when the job
starts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200511135825.219437-4-kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit e83dd6808c6e0975970f37b49b27cc37bb54eea8)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
2021-07-21 21:27:22 +08:00
Chen Qun
a09d406df8 backup: Improve error for bdrv_getlength() failure
RH-Author: Kevin Wolf <kwolf@redhat.com>
Message-id: <20200603160325.67506-6-kwolf@redhat.com>
Patchwork-id: 97103
O-Subject: [RHEL-AV-8.2.1 qemu-kvm PATCH v2 05/11] backup: Improve error for bdrv_getlength() failure
Bugzilla: 1778593
RH-Acked-by: Eric Blake <eblake@redhat.com>
RH-Acked-by: Max Reitz <mreitz@redhat.com>
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>

bdrv_get_device_name() will be an empty string with modern management
tools that don't use -drive. Use bdrv_get_device_or_node_name() instead
so that the node name is used if the BlockBackend is anonymous.

While at it, start with upper case to make the message consistent with
the rest of the function.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20200430142755.315494-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 58226634c4b02af7b10862f7fbd3610a344bfb7f)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
2021-07-21 21:27:22 +08:00
Chen Qun
03e474ad19 block/curl: HTTP header field names are case insensitive
RH-Author: Richard Jones <rjones@redhat.com>
Message-id: <20200528142737.17318-3-rjones@redhat.com>
Patchwork-id: 96895
O-Subject: [RHEL-AV-8.2.1 qemu-kvm PATCH 2/2] block/curl: HTTP header field names are case insensitive
Bugzilla: 1841038
RH-Acked-by: Eric Blake <eblake@redhat.com>
RH-Acked-by: Max Reitz <mreitz@redhat.com>
RH-Acked-by: Philippe Mathieu-Daudé <philmd@redhat.com>

From: David Edmondson <david.edmondson@oracle.com>

RFC 7230 section 3.2 indicates that HTTP header field names are case
insensitive.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20200224101310.101169-3-david.edmondson@oracle.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 69032253c33ae1774233c63cedf36d32242a85fc)
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
2021-07-21 21:27:22 +08:00
Chen Qun
a3094362d2 block/curl: HTTP header fields allow whitespace around values
RH-Author: Richard Jones <rjones@redhat.com>
Message-id: <20200528142737.17318-2-rjones@redhat.com>
Patchwork-id: 96894
O-Subject: [RHEL-AV-8.2.1 qemu-kvm PATCH 1/2] block/curl: HTTP header fields allow whitespace around values
Bugzilla: 1841038
RH-Acked-by: Eric Blake <eblake@redhat.com>
RH-Acked-by: Max Reitz <mreitz@redhat.com>
RH-Acked-by: Danilo de Paula <ddepaula@redhat.com>

From: David Edmondson <david.edmondson@oracle.com>

RFC 7230 section 3.2 indicates that whitespace is permitted between
the field name and field value and after the field value.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Message-Id: <20200224101310.101169-2-david.edmondson@oracle.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
(cherry picked from commit 7788a319399f17476ff1dd43164c869e320820a2)
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
2021-07-21 21:27:22 +08:00
Chen Qun
7cdb42a8eb spec: Update patch and changelog with !164 qemu-4.1 bugfix !164
virtio: don't enable notifications during polling
usbredir: Prevent recursion in usbredir_write
xhci: recheck slot status
vhost: Add names to section rounded warning
vhost-user: Print unexpected slave message types
contrib/libvhost-user: Protect slave fd with mutex
libvhost-user: Fix some memtable remap cases
xics: Don't deassert outputs
i386: Resolve CPU models to v1 by default

Signed-off-by: Chen Qun<kuhn.chenqun@huawei.com>
2021-07-21 21:27:20 +08:00
Chen Qun
d951287a06 i386: Resolve CPU models to v1 by default
When using `query-cpu-definitions` using `-machine none`,
QEMU is resolving all CPU models to their latest versions.  The
actual CPU model version being used by another machine type (e.g.
`pc-q35-4.0`) might be different.

In theory, this was OK because the correct CPU model
version is returned when using the correct `-machine` argument.

Except that in practice, this breaks libvirt expectations:
libvirt always use `-machine none` when checking if a CPU model
is runnable, because runnability is not expected to be affected
when the machine type is changed.

For example, when running on a Haswell host without TSX,
Haswell-v4 is runnable, but Haswell-v1 is not.  On those hosts,
`query-cpu-definitions` says Haswell is runnable if using
`-machine none`, but Haswell is actually not runnable using any
of the `pc-*` machine types (because they resolve Haswell to
Haswell-v1).  In other words, we're breaking the "runnability
guarantee" we promised to not break for a few releases (see
qemu-deprecated.texi).

To address this issue, change the default CPU model version to v1
on all machine types, so we make `query-cpu-definitions` output
when using `-machine none` match the results when using `pc-*`.
This will change in the future (the plan is to always return the
latest CPU model version if using `-machine none`), but only
after giving libvirt the opportunity to adapt.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1779078
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20191205223339.764534-1-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-21 21:27:20 +08:00
Chen Qun
b5b6f23aae xics: Don't deassert outputs
The correct way to do this is to deassert the input pins on the CPU side.
This is the case since a previous change.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157548862298.3650476.1228720391270249433.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-21 21:27:20 +08:00
Chen Qun
7c2cfdc326 libvhost-user: Fix some memtable remap cases
If a new setmemtable command comes in once the vhost threads are
running, it will remap the guests address space and the threads
will now be looking in the wrong place.

Fortunately we're running this command under lock, so we can
update the queue mappings so that threads will look in the new-right
place.

Note: This doesn't fix things that the threads might be doing
without a lock (e.g. a readv/writev!)  That's for another time.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-21 21:27:20 +08:00
Chen Qun
c597707bd4 contrib/libvhost-user: Protect slave fd with mutex
In future patches we'll be performing commands on the slave-fd driven
by commands on queues, since those queues will be driven by individual
threads we need to make sure they don't attempt to use the slave-fd
for multiple commands in parallel.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-21 21:27:20 +08:00
Chen Qun
eb5868b7f1 vhost-user: Print unexpected slave message types
When we receive an unexpected message type on the slave fd, print
the type.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-07-21 21:27:20 +08:00
Chen Qun
954d314d88 vhost: Add names to section rounded warning
Add the memory region names to section rounding/alignment
warnings.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200116202414.157959-2-dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-21 21:27:20 +08:00
Chen Qun
51fcddf40b xhci: recheck slot status
Factor out slot status check into a helper function.  Add an additional
check after completing transfers.  This is needed in case a guest
queues multiple transfers in a row and a device unplug happens while
qemu processes them.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1786413
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200107083606.12393-1-kraxel@redhat.com
2021-07-21 21:27:20 +08:00
Chen Qun
c478f4c43c usbredir: Prevent recursion in usbredir_write
I've got a case where usbredir_write manages to call back into itself
via spice; this patch causes the recursion to fail (0 bytes) the write;
this seems to avoid the deadlock I was previously seeing.

I can't say I fully understand the interaction of usbredir and spice;
but there are a few similar guards in spice and usbredir
to catch other cases especially onces also related to spice_server_char_device_wakeup

This case seems to be triggered by repeated migration+repeated
reconnection of the viewer; but my debugging suggests the migration
finished before this hits.

The backtrace of the hang looks like:
  reds_handle_ticket
  reds_handle_other_links
  reds_channel_do_link
  red_channel_connect
  spicevmc_connect
  usbredir_create_parser
  usbredirparser_do_write
  usbredir_write
  qemu_chr_fe_write
  qemu_chr_write
  qemu_chr_write_buffer
  spice_chr_write
  spice_server_char_device_wakeup
  red_char_device_wakeup
  red_char_device_write_to_device
  vmc_write
  usbredirparser_do_write
  usbredir_write
  qemu_chr_fe_write
  qemu_chr_write
  qemu_chr_write_buffer
  qemu_mutex_lock_impl

and we fail as we lang through qemu_chr_write_buffer's lock
twice.

Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1752320

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20191218113012.13331-1-dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2021-07-21 21:27:20 +08:00
Chen Qun
7f06fac3c9 virtio: don't enable notifications during polling
Virtqueue notifications are not necessary during polling, so we disable
them.  This allows the guest driver to avoid MMIO vmexits.
Unfortunately the virtio-blk and virtio-scsi handler functions re-enable
notifications, defeating this optimization.

Fix virtio-blk and virtio-scsi emulation so they leave notifications
disabled.  The key thing to remember for correctness is that polling
always checks one last time after ending its loop, therefore it's safe
to lose the race when re-enabling notifications at the end of polling.

There is a measurable performance improvement of 5-10% with the null-co
block driver.  Real-life storage configurations will see a smaller
improvement because the MMIO vmexit overhead contributes less to
latency.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20191209210957.65087-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-21 21:27:20 +08:00
openeuler-ci-bot
14c130eda5 !334 x86: VMX features can be enabled/disabled via “-cpu” flags
From: @imxcc
Reviewed-by: @zhouli57,@kevinzhu1
Signed-off-by: @kevinzhu1
2021-07-21 08:51:30 +00:00
imxcc
26a68c032e x86: VMX features can be enable/disable via -cpu flags
migration: bugfix, multifd send pages next channel and
Make sure that we do not call write in case

Signed-off-by: imxcc <xingchaochao@huawei.com>
2021-07-21 15:59:20 +08:00
openeuler-ci-bot
ce9565bdc5 !332 Automatically generate code patches with openeuler !162
From: @kuhnchen18
Reviewed-by: @imxcc
Signed-off-by: @imxcc
2021-07-20 13:48:13 +00:00
Chen Qun
8dd195e99a spec: Update release version with !162
increase release verison by one

Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
2021-07-20 21:27:28 +08:00
Chen Qun
c2924fc4ee spec: Update patch and changelog with !162 block/crypto: improved performance for AES-XTS encryption for LUKS disk encryption !162
crypto: add support for nettle's native XTS impl
crypto: add support for gcrypt's native XTS impl
tests: benchmark crypto with fixed data size, not time period
tests: allow filtering crypto cipher benchmark tests

Signed-off-by: Chen Qun<kuhn.chenqun@huawei.com>
2021-07-20 21:27:28 +08:00
Chen Qun
0035520720 tests: allow filtering crypto cipher benchmark tests
Add support for specifying a cipher mode and chunk size as argv to
filter which combinations are benchmarked. For example to only
benchmark XTS mode with 512 byte chunks:

  ./tests/benchmark-crypto-cipher xts 512

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-07-20 21:27:28 +08:00
Chen Qun
f55d31032c tests: benchmark crypto with fixed data size, not time period
Currently the crypto benchmarks are processing data in varying chunk
sizes, over a fixed time period. This turns out to be a terrible idea
because with small chunk sizes the overhead of checking the elapsed
time on each loop iteration masks the true performance.

Benchmarking over a fixed data size avoids the loop running any system
calls which can interfere with the performance measurements.

Before this change

Enc chunk 512 bytes 2283.47 MB/sec Dec chunk 512 bytes 2236.23 MB/sec OK
Enc chunk 4096 bytes 2744.97 MB/sec Dec chunk 4096 bytes 2614.71 MB/sec OK
Enc chunk 16384 bytes 2777.53 MB/sec Dec chunk 16384 bytes 2678.44 MB/sec OK
Enc chunk 65536 bytes 2809.34 MB/sec Dec chunk 65536 bytes 2699.47 MB/sec OK

After this change

Enc chunk 512 bytes 2058.22 MB/sec Dec chunk 512 bytes 2030.11 MB/sec OK
Enc chunk 4096 bytes 2699.27 MB/sec Dec chunk 4096 bytes 2573.78 MB/sec OK
Enc chunk 16384 bytes 2748.52 MB/sec Dec chunk 16384 bytes 2653.76 MB/sec OK
Enc chunk 65536 bytes 2814.08 MB/sec Dec chunk 65536 bytes 2712.74 MB/sec OK

The actual crypto performance hasn't changed, which shows how
significant the mis-measurement has been for small data sizes.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-07-20 21:27:28 +08:00
Chen Qun
460316109f crypto: add support for gcrypt's native XTS impl
Libgcrypt 1.8.0 added support for the XTS mode. Use this because long
term we wish to delete QEMU's XTS impl to avoid carrying private crypto
algorithm impls.

As an added benefit, using this improves performance from 531 MB/sec to
670 MB/sec, since we are avoiding several layers of function call
indirection.

This is even more noticable with the gcrypt builds in Fedora or RHEL-8
which have a non-upstream patch for FIPS mode which does mutex locking.
This is catastrophic for encryption performance with small block sizes,
meaning this patch improves encryption from 240 MB/sec to 670 MB/sec.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-07-20 21:27:28 +08:00
Chen Qun
ed784c457a crypto: add support for nettle's native XTS impl
Nettle 3.5.0 will add support for the XTS mode. Use this because long
term we wish to delete QEMU's XTS impl to avoid carrying private crypto
algorithm impls.

Unfortunately this degrades nettle performance from 612 MB/s to 568 MB/s
as nettle's XTS impl isn't so well optimized yet.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-07-20 21:27:28 +08:00
openeuler-ci-bot
dd894be433 !331 Automatically generate code patches with openeuler !161
From: @kuhnchen18
Reviewed-by: @imxcc
Signed-off-by: @imxcc
2021-07-20 08:55:42 +00:00
Chen Qun
b43563c752 spec: Update release version with !161
increase release verison by one

Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
2021-07-20 16:27:34 +08:00
Chen Qun
dadccf3646 spec: Update patch and changelog with !161 x86: new CPU models for Denverton (server-class Atom-based SoC), Snowridge, and Dhyana !161
target/i386: Introduce Denverton CPU model
target/i386: Add Snowridge-v2 (no MPX) CPU model
i386: Add CPUID bit for CLZERO and XSAVEERPTR

Signed-off-by: Chen Qun<kuhn.chenqun@huawei.com>
2021-07-20 16:27:34 +08:00
Chen Qun
0b329b7ce3 i386: Add CPUID bit for CLZERO and XSAVEERPTR
The CPUID bits CLZERO and XSAVEERPTR are availble on AMD's ZEN platform
and could be passed to the guest.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-20 16:27:33 +08:00
Chen Qun
f8b061efaa target/i386: Add Snowridge-v2 (no MPX) CPU model
Add new version of Snowridge CPU model that removes MPX feature.

MPX support is being phased out by Intel. GCC has dropped it, Linux kernel
and KVM are also going to do that in the future.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20191012024748.127135-1-xiaoyao.li@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-20 16:27:33 +08:00
Chen Qun
3265d8ae1d target/i386: Introduce Denverton CPU model
Denverton is the Atom Processor of Intel Harrisonville platform.

For more information:
https://ark.intel.com/content/www/us/en/ark/products/\
codename/63508/denverton.html

Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190718073405.28301-1-tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-20 16:27:33 +08:00
openeuler-ci-bot
a0195a27fb !330 Automatically generate code patches with openeuler !152 !157
From: @kuhnchen18
Reviewed-by: @imxcc
Signed-off-by: @imxcc
2021-07-20 01:04:15 +00:00
Chen Qun
b405a3c6ee spec: Update release version with !152 !157
increase release verison by one

Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
2021-07-19 21:29:25 +08:00
Chen Qun
665079b389 spec: Update patch and changelog with !157 [feature]add support for AVX512_BF16 and new CPU model Cooperlake !157
x86: Intel AVX512_BF16 feature enabling
i386: Add MSR feature bit for MDS-NO
i386: Add macro for stibp
i386: Add new CPU model Cooperlake
target/i386: Add new bit definitions of MSR_IA32_ARCH_CAPABILITIES
target/i386: Add missed security features to Cooperlake CPU model
target/i386: add PSCHANGE_NO bit for the ARCH_CAPABILITIES MSR
target/i386: Export TAA_NO bit to guests

Signed-off-by: Chen Qun<kuhn.chenqun@huawei.com>
2021-07-19 21:29:25 +08:00
Chen Qun
3b6358a8ac target/i386: Export TAA_NO bit to guests
TSX Async Abort (TAA) is a side channel attack on internal buffers in
some Intel processors similar to Microachitectural Data Sampling (MDS).

Some future Intel processors will use the ARCH_CAP_TAA_NO bit in the
IA32_ARCH_CAPABILITIES MSR to report that they are not vulnerable to
TAA. Make this bit available to guests.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
2021-07-19 21:29:25 +08:00
Chen Qun
6713c545f7 target/i386: add PSCHANGE_NO bit for the ARCH_CAPABILITIES MSR
This is required to disable ITLB multihit mitigations in nested
hypervisors.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
2021-07-19 21:29:25 +08:00
Chen Qun
011ace1710 target/i386: Add missed security features to Cooperlake CPU model
It lacks two security feature bits in MSR_IA32_ARCH_CAPABILITIES in
current Cooperlake CPU model, so add them.

This is part of uptream commit 2dea9d9

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
2021-07-19 21:29:25 +08:00
Chen Qun
c9a7e0fa18 target/i386: Add new bit definitions of MSR_IA32_ARCH_CAPABILITIES
The bit 6, 7 and 8 of MSR_IA32_ARCH_CAPABILITIES are recently disclosed
for some security issues. Add the definitions for them to be used by named
CPU models.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20191225063018.20038-2-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
2021-07-19 21:29:25 +08:00
Chen Qun
4251256e51 i386: Add new CPU model Cooperlake
Cooper Lake is intel's successor to Cascade Lake, the new
CPU model inherits features from Cascadelake-Server, while
add one platform associated new feature: AVX512_BF16. Meanwhile,
add STIBP for speculative execution.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-4-git-send-email-cathy.zhang@intel.com>
Reviewed-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
2021-07-19 21:29:25 +08:00
Chen Qun
b0fb8b4b34 i386: Add macro for stibp
stibp feature is already added through the following commit.
0e89165829

Add a macro for it to allow CPU models to report it when host supports.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-3-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
2021-07-19 21:29:25 +08:00
Chen Qun
ea12d023b2 i386: Add MSR feature bit for MDS-NO
Define MSR_ARCH_CAP_MDS_NO in the IA32_ARCH_CAPABILITIES MSR to allow
CPU models to report the feature when host supports it.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-2-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
2021-07-19 21:29:25 +08:00
Chen Qun
f08117fe03 x86: Intel AVX512_BF16 feature enabling
Intel CooperLake cpu adds AVX512_BF16 instruction, defining as
CPUID.(EAX=7,ECX=1):EAX[bit 05].

The patch adds a property for setting the subleaf of CPUID leaf 7 in
case that people would like to specify it.

The release spec link as follows,
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf

Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
2021-07-19 21:29:25 +08:00
Chen Qun
1b38d7c3de spec: Update patch and changelog with !152 hw/net/rocker_of_dpa: fix double free bug of rocker device !152
hw/net/rocker_of_dpa: fix double free bug of rocker device

Signed-off-by: Chen Qun<kuhn.chenqun@huawei.com>
2021-07-19 21:29:23 +08:00